Writing Good Exam Questions

A Self-study Workbook

Written by Dr Kate Exley

FOR TRAINING PURPOSES ONLY

Produced by the Staff and Educational Development Unit, March 2010 (minor revisions made August 2012)

Everyone deserves to be inspired!

Contents

List of Figures and List of Tables
1. Introduction and Purposes of the workbook – Intended Learning Outcomes
2. LSHTM Exam requirements
3. The underlying principles of good question design
4. Aligning exam questions and specimen answers with intended learning outcomes
5. What kinds of knowledge and skills can be tested in examinations?
6. Reducing the impact of factors such as stress, interpretation, time
7. Different styles and formats for exam questions
8. Evaluating Draft Questions
9. Marking Approaches: Using assessment criteria and marking schemes
10. Ways of producing accurate and clear marking guidance for questions
11. Ways of producing specimen answers suitable for distribution to students
12. Exam question development, validation and approval processes
13. Security issues and appeals
14. Providing support and guidance for students
15. Concluding Remarks
Further Reading suggestions
Appendices
Appendix 1
Appendix 2
Appendix 3

List of Figures

Figure 1 Diagram to illustrate the principles of Constructive Alignment in module design
Figure 2 Bloom's Taxonomy of Cognition – Revisited by Anderson & Krathwohl (2001)
Figure 3 A Normal Distribution or bell-shaped curve

List of Tables

Table 1 A table of suggested verbs mapped against the Anderson and Krathwohl adapted levels of Bloom's Taxonomy of Cognition
Table 2 Ways in which intellectual skills can be tested through different question stems
Table 3 Some Common Essay Style Questions used in Exams
Table 4 LSHTM Marking Gradepoints descriptions (Overarching criteria)
Table 5 Examples of grade conversions used at the School

1. Introduction and Purposes of the workbook

1.1 The intended reader

This workbook is intended to support colleagues at The London School of Hygiene and Tropical Medicine as they seek to write appropriate Masters level examination questions and their accompanying assessment criteria and marking guidance. The booklet aims to provide clear guidance on what is expected and, through the use of examples and exercises, to enable colleagues to test out their own exam question writing. It is primarily intended for those who are new to writing examination questions, although more experienced colleagues may find it useful as a reference or updating source.

The workbook can be used in a number of ways. For those unable to attend the 'Writing Better Exam Questions' staff development workshop, it can act as a distance learning resource that can be worked through systematically, or it can be quickly consulted to check and review current practice. For those using the workbook in the distance learning mode the anticipated learning outcomes are –

To be familiar with the structure and format of LSHTM examinations

To be able to apply the principles of constructive alignment

To critique the different kinds of knowledge and skills that can be assessed through the examination format

To consider how to ensure that questions are fair and equally accessible to all students. (For example the layout and design of the question on the page)

To prepare appropriate marking guidance

To be familiar with question approval and validation processes in the School

1.2 The underlying principle

John Biggs's theory of Constructive Alignment is used to provide the underpinning framework for question design here. The different kinds of knowledge and skills that can be tested appropriately through a written examination method are discussed, as is the need for questions to give equal and fair opportunity to all students. As many students at The School are from overseas and may not have English as a first language, there are a number of factors to consider when writing clear and unambiguous questions that do not unintentionally favour particular student groups. A range of different question formats are reviewed and critiqued and a number of ways of quality assuring draft questions are suggested and explained.

However, writing the exam question is only half the story – producing the associated marking guidance is also considered here. Marking guidance can take a number of different forms ranging from specimen or model answers through to descriptive criteria and detailed marking schemes matched to necessary answer content. These too will be discussed with reference to examples from The School.

Aims of assessment at LSHTM

“For all LSHTM courses, the overall aim of assessment is to facilitate the learning of important elements in the course and to test that the student has reached the minimum standard acceptable for the award.”

LSHTM Assessment Code of Practice (January 2012)

2. LSHTM Exam requirements

London-based MSc Course June exams:

There are two three-hour written examination papers taken in June. Together these two papers contribute 30% to the final assessment (15% each).

Paper 1 examines the content of Term 1 teaching. It usually comprises questions from each of the core/linear modules taken in Term 1. Thus the same questions for a particular module will appear on several MSc Course exam Paper 1s. Design and marking of these questions is co-ordinated by the teaching module organiser together with other teaching module staff.

Paper 2 tests candidates' ability to integrate the knowledge and skills acquired during the whole of the MSc course. Paper 2 was originally developed in the mid-1990s after full implementation of the present teaching module structure. As a whole, Paper 2 should be examining the key knowledge/skills which a candidate graduating with an MSc in X should have. In devising Paper 2, MSc Exam Boards should reflect on the intended learning outcomes for the MSc, some of which are likely to require assessment in this exam (others might have been assessed in compulsory study modules, the project, etc.). MSc intended learning outcomes can be found in the MSc Course Handbook, prospectus etc. Questions should require integration of knowledge/skills acquired in different parts of the MSc course. They might use material from compulsory modules but not from optional ones that only some of the class might take.

Distance learning PGDip/MSc June exams:

Most Distance Learning (DL) modules have a 2-hour exam covering the content of that module and this contributes 70-100% of the module's mark. MSc EPI and CT also have a 3-hour integrating paper (E400), akin to Paper 2 above, which candidates sit in their final year of the course. Exam questions are usually co-ordinated by the Module organiser or other designated members of staff.

To provide guidance and formative support on the examination process, DL exams are compiled each year into two 'examiners' reports' for each MSc course: one for core modules and one for advanced modules. The reports give the complete exam papers together with a guide on how the questions should have been answered. DL students are sent reports from the 3 previous years.

All module exams taken will count towards the degree, save only where a student has been assessed on more modules than are required – in which instance the Exam Board will determine whether an award may be given, and which modules are counted towards it.

The School's Assessment Code of Practice describes six assessment objectives that should be kept in mind when writing examination questions and designing assessments. These are –

Objectives of assessment at LSHTM

Identify whether each student has attained a minimum level of achievement necessary to pass the course and identify those who fail to achieve that level.

Note – intended learning outcomes should set out the minimum standard of learning required for the award, and assessment should be designed to provide students with the opportunity to demonstrate that they have achieved or exceeded that standard (refer to Section 4 below).

Focus learning on the important aspects of each course.

Note – In attempting to increase the difficulty of the assessment tasks set, it is important to aim to assess more deeply rather than more widely, i.e. to avoid focussing on peripheral and less consequential details (refer to Sections 3 and 9 below).

Provide feedback on performance so that learning may improve.

Note – The goal is that students are able to learn through the process of being assessed and view assessment as part of their learning process. This requires that any feedback provided is designed to feed forward (refer to Section 12 below).

Provide a means of encouragement.

Note – It is important to remember that nothing is more motivating than success, and students need to be able to see the progress they are making and build their confidence as achievers (refer to Section 11 below).

Interfere as little as possible with important, but ungraded, aspects of a student's educational experience.

Note – We define learning outcomes but individuals will learn other things whilst studying, and unintended outcomes can be equally valuable (refer to Section 5 below).

Identify those students achieving the highest standards so that they can be considered for a Distinction.

Note – Exam questions, assessment criteria and marking guidance should be designed to allow differentiation and enable good students to get high marks (refer to Section 10 below).

3. The underlying principles of good question design

The goal – Test items should be really difficult for people who don't understand the subject material, but they should be straightforward for those who do. If an item is difficult because of complicated wording (e.g., double negatives) or vocabulary, you will end up testing language skills rather than ability in the discipline.

The principles that underlie good question design are –

i. Clarity
ii. Reliability
iii. Validity
iv. Authenticity
v. Fairness

Thinking about each in turn –

i) Clarity

“Nothing in the content or structure of [a test] item should prevent an informed student from responding correctly.” Gronlund (1998)

The clarity of an exam question may be compromised by unclear test instructions, confusing and ambiguous terminology, overly verbose and complicated vocabulary and/or sentence structure plus unnecessary and distracting detail (Gay, L.R., & Airasian, P. 2000). The layout of a question is also very important in conveying clarity – particularly in longer, multi-sectioned or data handling styled questions.

Note – some dyslexic students have a tendency to misread, or miss completely, a short second line of text, an additional comment or the second part of a question, e.g.

“What will be the outcome of adding further sodium chloride at this point?
        Explain your answer.”

In an interview, a dyslexic student spoke of this second line 'being hidden away'. He had developed ways of re-reading questions to try to avoid this happening to him. However, the layout of a question may add to this problem, e.g. the indent here on the second line may make it 'more hidden'.

EXERCISE
Testing for clarity – contrast the following versions of the same exam question (essay format answer required):

Version A:

Public health policy in the United Kingdom underwent a number of significant changes during the Twentieth Century that can be directly attributed to the needs and exigencies brought about by international conflict. Some of the changes and developments that resulted to health systems and service delivery are still with us today and it is important that we understand the background of circumstances that influenced the decisions that were made. Provide a short analysis charting what you consider to be the main transitions in public health policy brought about by the unique needs and challenges, both direct and indirect, of an environment of international conflict, within the UK health systems specifically, using the Second World War as an example.

Version B:

Compare the advances in UK public health policy pre- and post-Second World War.

Think about points such as:

unclear test instructions,

confusing and ambiguous terminology,

being overly verbose,

using complicated vocabulary,

difficult or poor sentence structure,

unnecessary and distracting detail.

ii) Reliability

Does the question allow markers to grade it consistently and reproducibly, and does it allow markers to discriminate between different levels of performance? This frequently depends on the quality of the marking guidance and the clarity of the assessment criteria. It may also be improved by providing markers with training and opportunities to learn from more experienced assessors.

The likelihood of eliciting an accurate measure of a student's ability will be increased when students are provided with a variety of ways to demonstrate their knowledge and skills. For example, some students might generally do better in exams whilst other students do better in their coursework. Including both in a course will accommodate those differences between students. However, as the DL courses are provided through the University of London's external programme, which restricts the mode of most assessment to examinations, this may not be an option for all Module leaders. Even within a written examination, though, we can include a variety of question formats that can help to 'triangulate' and cater to a student's abilities and provide a more reliable measure of their attainments.

iii) Validity

A valid examination question measures achievement of the intended learning outcomes of the module/unit (not just what is easy to measure!). The form of the examination question may also be of importance in ensuring validity. For example, short answer questions are a good way of assessing greater breadth of material covered in a course but tend to focus on testing attainment of knowledge and application of knowledge, whilst longer essay style questions allow a more in-depth exploration of subject material and require a candidate to build and structure an argument or explain a complex concept with wide reference to examples and readings. If these aspects are important they should be clearly described in the learning outcomes and be transparent in the assessment criteria for the assessment to be valid.

iv) Authenticity

Authenticity is the need to match the style and approach of question setting to the reality of practice. This is particularly important when considering the assessment of Masters level qualifications, which are frequently taken by mature students who are accustomed to working within a professional context. A general example might be, rather than setting an essay style question, to ask students to present their understanding in the style of a professional, industrial or clinical report. This may be very important when considering the testing of 'procedural knowledge' or 'functioning knowledge' (please see 5.1). When the exam seeks to test a candidate's knowledge of how something works, the order or sequencing of events, the interplay between contributing factors etc., it can be very important to ensure this is built into the question formatting and context setting to allow authenticity.

Example
A learning outcome for a module is – “..will be able to design survey questionnaires to gather quantitative and qualitative data in the field.” An examination question to test this procedural (knowing how to do something) kind of knowledge (rather than memorisation of facts) could, for example, provide the students with a sample survey questionnaire and ask them to give feedback on it – e.g. point out weaknesses in its design or suggest improvements and explain how it could be administered in the field.

v) Fairness

You need to give students a fair chance to demonstrate what they know and can do and to be able to succeed in examinations. Fairness can be facilitated by being very clear about expectations in student performance, providing examples of past examination papers, giving opportunities for students to practise and gain 'exam technique' (through 'mocks' for example), plus transparency in the processes and criteria that will be used to mark and grade their work. Students should know what is expected of them in order to obtain a particular grade and their marks should be a reflection of their abilities and not a reflection of extraneous and irrelevant factors such as gender, disability etc. Providing a 'level playing field' is the aim and this is particularly important at The School when considering the different groups of students who come to study or embark upon DL courses, e.g. non-native English speakers, students who have previously experienced very different educational cultures, mature professionals etc.

4. Aligning exam questions (and specimen answers) with intended learning outcomes*

Constructive alignment is the term coined by John Biggs to describe a coherent approach to ensure that the learning outcomes, teaching and learning methods and the assessment for a unit of study are all directing student learning in the same direction.

Figure 1. Diagram to illustrate the principles of Constructive Alignment in module design.

• What do you want your students to learn? Aims and Learning Outcomes
• How will you help your students to learn it? Teaching and Learning Methods; Learner Support and Guidance
• How will you know how well they have learnt it? Assessment Methods and Criteria
• How do you know any of it is working? Module Evaluation

* 'Learning Outcomes' is the preferred terminology given in the QAA Codes of Practice; however, learning goals may also be described as learning objectives in some documentation.

An excellent place to start when writing an exam question is to go back to the Learning Outcomes for the course or module. These should describe what it is that you want your students to know about or be able to do at the end of the course.

Example
At the end of the module students should be able to select an appropriate method and use it to test the significance of collected data.

The learning outcome clarifies what opportunities need to be built into a test question to ensure that the test is valid. For the learning outcome given above, students should be expected to select a method, have the scope to apply the method to some data and, finally, be able to comment on the significance or otherwise of the data. To clarify further, it would be beneficial to demarcate these three different tasks within the question itself,

Example

• 'selection of appropriate method',
• 'using the method' and
• 'interpreting the significance of the findings'

perhaps as separate question sections a, b and c, each with a clear allocation of the total marks for the question.
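
As a rough illustration of that last point, the sketch below checks that the marks allocated to sections a, b and c account for the whole question. The section labels follow the example above; the 30/40/30 split, the 100-mark total and the function name are assumptions made purely for illustration and are not School guidance.

```python
def check_mark_allocation(sections, total_marks=100):
    """Return the number of unallocated marks (0 means fully allocated).

    `sections` maps a section label to a (task description, marks) pair.
    Raises ValueError if the sections claim more marks than the total.
    """
    allocated = sum(marks for _task, marks in sections.values())
    if allocated > total_marks:
        raise ValueError(
            f"Sections allocate {allocated} marks, more than the {total_marks} available"
        )
    return total_marks - allocated


# Hypothetical allocation for the three tasks named in the example above.
question = {
    "a": ("selection of appropriate method", 30),
    "b": ("using the method", 40),
    "c": ("interpreting the significance of the findings", 30),
}

remaining = check_mark_allocation(question)
for label, (task, marks) in question.items():
    print(f"({label}) {task}: {marks} marks")
print(f"Unallocated marks: {remaining}")
```

A check of this kind only confirms that the arithmetic is complete. Whether the weighting between the three tasks is appropriate remains a judgement to be made against the learning outcome.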

Considering scale and scope

Examination questions should also aim to indicate to the students how much is required of them to achieve a good mark – the scale and scope of their expected answers. One common way of doing this is to give a time limit (10 questions in 20 minutes) or you could limit the amount of space in the answer booklet or on-line pro-forma provided. Alternatively you can set a maximum word limit for responses. In addition to these 'structured' ways of indicating the length of answers expected, question setters can include 'Boundaries' in their questions, such as

Within the limits e.g. “Between 2001 and 2005…”

To what extent e.g. “Using your knowledge of both prokaryotes and eukaryotes…”

Quantities and amounts e.g. “Provide 5 reasons why…”

With reference to e.g. “With reference to the published research from ..”

EXERCISE
For the questions given below, underline the verb and key elements of the question that give an indication of the extent (limits and boundaries) of the question. Do you feel these are appropriate for Masters level study?

1. Describe the three main methods of economic evaluation (40%). What are the main strengths and weaknesses of each method? (40%). Support your answer with examples of disease evaluation (20%)

2. A recent retrospective analysis of health records in the Gambia has suggested that the incidence of malaria has fallen dramatically in that country over the last 10 years. The elimination of the disease is beginning to be discussed. The National Malaria Control Programme has begun a surveillance system to detect future changes. What advice would you give the National Malaria Control Programme on how to organize a surveillance system for malaria. Give practical tips for ensuring its quality.

3. Write short notes on THREE of the following. In each case explain the importance of the infectious agent and the mode of transmission in its spread and control.
a) rotavirus diarrhoea
b) measles
c) guinea worm
d) dengue
e) tuberculosis

Please see Appendix 1 for some feedback comments on this exercise. You may also wish to refer directly to the learning outcomes of your modules and the Master's level descriptors in the Qualification Framework document.

5. What kinds of knowledge and skills can be tested in examinations?

It is possible to test a wide variety of different kinds of knowledge, skills and attitudes through the careful writing of examination questions.

5.1 kinds of knowledge that can be tested, e.g. Knowledge domains
5.2 kinds of intellectual skills, e.g. Analysing, Evaluating
5.3 kinds of transferable skills, e.g. Writing skills, Time use
5.4 kinds of attitude, e.g. Ethics, equality

Again taking each of these elements in turn let us first consider the different kinds of Knowledge and ways of knowing that you may wish to test in your students.

“Exam questions should test a range of knowledge and skills at Masters level. They should test and reward critical appreciation and the ability to apply what has been learnt rather than the passive reproduction of memorised facts.”

Assessment Code of Practice (2012)

5.1. The kinds of knowledge that can be tested – knowledge domains

Factual Knowledge: Terminology, facts, figures

Conceptual Knowledge: Classification, Principles, Theories, Structures, Frameworks

Procedural Knowledge: Algorithms, Techniques and Methods, and knowing when and how to use them

Metacognitive Knowledge: Strategy, Overview, Self Knowledge, Knowing how you know

EXERCISE
Please consider the following four examination questions and decide what kind of knowledge you feel they would test.

1. What are the key steps and processes in bringing a new anti-cancer drug to market and introducing it for clinical use?

2. Write short notes on the following –
a. Bacterial pathogenicity
b. Neisseria gonorrhoeae
c. Group A streptococci

3. Using the tabulated data provided calculate the incidence risk of prostate cancer per 1000 men, per 5 years, at each of the given levels of alcohol consumption.

4. Why do malaria parasites persist in the human population? Explain the choice of drugs which could be used to prevent persistence of Plasmodium falciparum and Plasmodium vivax.

5.2. The kinds of intellectual skills that can be tested

At Masters level, learning outcomes for modules usually require students to demonstrate higher level cognitive and intellectual skills, i.e. it is not enough for students to demonstrate that they can remember facts and figures, names and dates; they need to show they are able to interpret the meaning of data and evaluate their significance.

Several cognitive psychologists have been interested in categorising the different ways that we can learn and 'think' about things, the most famous of these being a group led by Benjamin Bloom in the mid-1950s. Bloom et al (1956) identified three different domains of learning: Cognitive (knowledge), Affective (attitudinal) and Psychomotor (manual skills). They went on to produce complex hierarchies of skills for the knowledge and attitudinal domains that have been re-visited and revised by many researchers since.

When writing examination questions it can be extremely helpful to consider the Cognitive domain hierarchies in particular. Indeed, thinking carefully about the level of cognition that is to be tested will help to select the most appropriate verb to be used in the exam question.

e.g. Do we want to test a candidate's ability to 'list important features', 'analyse the given findings' or 'critique the argument they give'?

Anderson et al's (2001) re-working of Bloom's taxonomy makes this easier as they chose to present the hierarchy of sub-categories as active verbs, and it is their version particularly that has been widely used in course design and question design in more recent years. It is however important to remember that,

“Although Bloom's lends itself to wide application, each discipline must define the original classifications within the context of their field” Crowe et al (2008)

Figure 2. Bloom's Taxonomy of Cognition – Revisited by Anderson & Krathwohl (2001)

Levels shown in the figure, from highest to lowest: Create, Evaluate, Analyse, Apply, Understand, Remember

Note – Some colleagues in the School may already be familiar with the original Bloom taxonomy that uses the terms Knowledge, Comprehension, Application, Analysis, Synthesis and Evaluation

Table 1. A table of suggested verbs mapped against the Anderson and Krathwohl adapted levels of Bloom's Taxonomy of Cognition

Cognitive Level: Verb Examples

1. Remember: define, repeat, record, list, recall, name, relate, underline.
2. Understand: translate, restate, discuss, describe, recognise, explain, express, identify, locate, report, review, tell.
3. Apply: interpret, apply, employ, use, demonstrate, dramatise, practice, illustrate, operate, schedule, sketch.
4. Analyse: distinguish, analyse, differentiate, appraise, calculate, experiment, test, compare, contrast, criticise, diagram, inspect, debate, question, relate, solve, examine, categorise.
5. Evaluate: judge, appraise, evaluate, rate, compare, revise, assess, estimate.
6. Create: compose, plan, propose, design, formulate, arrange, assemble, collect, construct, create, set-up, organise, manage, prepare.

It is easy to see how Bloom's hierarchies are employed in different question stems; for example, see Table 2.

Table 2. Ways in which intellectual skills can be tested through different question stems.

Intellectual Skill: Stem

Comparing
• Describe the similarities and differences between...
• Compare the following two methods for...

Relating & Effecting
• What are the major causes of...
• What would be the most likely effects of...

Justifying
• Which of the following alternatives do you favor and why?
• Explain why you agree or disagree with the following statement.

Summarising
• State the main points included in...
• Briefly summarize the contents of...

Generalising
• Formulate several valid generalizations for the following data.
• State a set of principles that can explain the following events.

Inferring
• In light of this information, what is most likely to happen when...
• How would person X be likely to react to the following issue?

Classifying
• Group the following items according to...
• What do the following items have in common?

Creating
• List as many ways as you can think of for/to...
• Describe what would happen if...

Applying
• Using the principles of X describe how you would solve...
• Describe a situation that illustrates the principle of...

Analysing
• Describe the reasoning errors in the following paragraph.
• List and describe the main characteristics of...

Synthesising
• Describe a plan for providing that...

Evaluating
• Describe the strengths and weaknesses of...

(Adapted from Figure 7.11 of McMillan (2001) and Piontek, M.E. (2008))

Note – you may like to compare these question stems with Bloom's taxonomy, given earlier, and to cross-refer to the learning outcomes specified for your own Modules.

EXERCISE
Take a few moments to look down this list of question stems and select two that you feel could be used to test students on your module/course. Why have you selected these two?

5.3. The kinds of transferable skills that can be tested

Short answer and essay styled questions do give an assessor the opportunity to judge a range of generic or transferable skills in the way students answer the questions or respond to the tasks set. The most obvious of these are the ability to write clearly and appropriately, to structure and organise answers so that the most important points are prioritised and well made, and to cite and use source material effectively. If these skills are to be included and given value in the assessment, this should be clearly stated in the assessment criteria used to make judgements and this fact should be made clear to students.

At The School this is an important issue as many of the Masters students are non-native English speakers. The proportion of the marks for a test question allocated to skills such as 'written English' should be related to the Aims and Learning Outcomes for the course and context. In some cases accuracy and style may be considered important, e.g. to highlight professional skills and competencies, and be included in the assessment criteria, whilst in others such characteristics are not what is being taught and considered.

5.4. The kinds of attitude that can be tested

Attitudinal learning outcomes, such as equality, fairness, ethical considerations etc, may be important learning outcomes in School Masters programmes and as such are appropriate factors to be tested through the examinations. It is a complex area of assessment as one can argue that just because a candidate knows what they should be saying in response to an equality issue, this does not necessarily reflect what they really 'feel' or how they would react. Examination answers may therefore only be considered as a partial reflection of a candidate's attitude. In some cases it may be that it is more straightforward to 'assume' a student is adhering to the necessary programme attitudes but to penalise (through the grading structure) cases where such attitudes are transgressed. For example, when answers indicate important values are either not fully understood or are not being applied by a candidate, e.g. an answer is unacceptably gendered or racist etc.

6. Reducing the impact of extraneous factors such as stress, interpretation, time

Ability to work under pressure or to demonstrate 'stress tolerance' is unlikely to be a valid learning outcome for a Masters Programme at The School, and therefore all attempts should be made to reduce the impact that stress and nerves may have on a student's performance in an examination. It is possible to set and run examinations in ways that limit the importance of stress-induced factors (such as memory lapses) on success. Written examinations can be organised as open book exams* or question topics can be pre-seen by candidates. Such strategies reduce the need to 'question spot' and the impact of 'luck' in revising the right or wrong sub-selection of topics tested. They allow students to think more deeply about, and possibly research, their views before attempting the questions (as with course work assessments), but have the added advantage of avoiding some of the concerns of plagiarism, in that candidates produce their individual answers under exam conditions. Those familiar with running these types of examination comment that the quality of student answers is frequently judged to be of a much higher standard (again, as is the case with course work answers).

* Open book examinations can allow students to take their own notes, a choice of texts or previously specified items into the examination.


If examinations are to be run traditionally as unseen, time constrained tests carried out by individuals in silence – then there are a number of things that the question writer can consider to minimise the impact of such stress factors. For example,

Check that the question does not assume a lot of background knowledge which may be culturally specific or introduce unnecessary bias;

Provide any important (untested) background detail within the body of the question;

Give mark or timing guides within the framing of the question that indicate the relative importance or attached weightings for each sub-section;

Set multiple-part problem questions so that the parts are independent of each other. This means that if a student gets the first part wrong they don't automatically lose marks on subsequent sections, and it makes grading much quicker and more straightforward.

E.g. in the second part of a question, write something like “In the next part of the calculation, assume that the answer to Part (a) was 25, regardless of what you actually got in Part (a). Note that 25 is NOT necessarily the correct answer to Part (a).”

EXERCISE Can you think of any additional aspects in the exam questions you will be writing that should be considered to reduce the impact of stress factors? Please list these here. • • •

An extended example (and exercise) is provided in Appendix 2.


7. Different styles and formats for exam questions

There are a number of ways in which examination questions can be written and structured that in turn require very different responses from students. Examination papers may consist of a variety of these formats. For example, a paper may consist of an initial section of 10 compulsory, short answer questions followed by a second section in which the student is asked to attempt three from six longer questions, which may be essay, case study or problem solving styled questions.

Here are some examples of different ways questions are written at the School, with a commentary highlighting important features (such as the need to avoid ambiguity, bias and inequality and yet be able to discriminate between different levels of attainment and achievement).

7.1 Objective Tests e.g. True-False, Matching Pairs and Multiple Choice Questions

There are few examples of such question types being used extensively in summative assessments at the School. They are included here for completeness' sake, and in acknowledgement that some teachers may well be using these question formats as part of their class or on-line teaching, as self assessment or formative assessment opportunities for their students.

“Objective tests require a user to choose or provide a response to a question whose correct answer is pre-determined. Such a question might require a student to
• Select a solution from a set of choices (MCQ, true-false, matching, multiple response);
• Identify an object or position (graphical hotspot);
• Supply brief text responses (text input, word or phrase matching);
• Enter numeric text responses (number input); or
• Provide a mathematical formula (string evaluation or algebraic comparison).”
Pass-It, Good Practice Guide


True-False
Used to test breadth of knowledge, but the problem of 'guessing' is a major worry.

Matching Pairs
Used to assess knowledge of complex and inter-connecting relationships.

Multiple Choice Questions - Different Formats
There are many different types of MCQs. Some are especially well suited for certain types of content. Some are particularly good for testing higher-order learning. Some are inherently 'easier' or 'more difficult' than others.

o One-Choice Completions - Best Answer

The most commonly used MCQ format is simply a short-answer question with a number of alternatives to choose from.

o Multiple-Choice Completions

This MCQ format allows for more than one correct answer. Such questions are more difficult since the student is not just looking for one correct response among four incorrect responses. However, the intent of this format is not to test four separate points but rather to set up an interpretive exercise.

o Quantitative and Functional Relationships

An MCQ format that deals with quantitative and/or functional relationships. They are generally best for knowledge testing but can also be used to test higher-order learning outcomes.

There is extensive guidance available on-line (particularly from Universities in the USA) on the construction of multiple-choice tests and some of these are listed in the references provided at the end of this workbook. Traditionally used to test lower order cognitive skills, their use in assessing higher order, Masters skills, such as problem solving and analysis, is increasingly being explored.

7.2 Short answer questions can take many forms

• 'Constructed-response' or open-ended questions that require students to create an answer. These may be very short and of a 'fill in the blank' nature or longer, a few sentences or a couple of paragraphs maximum. They can be used to test core knowledge from a module and check the student has the required breadth in understanding.

• Calculations and data manipulation questions


Example -

The investigators want to perform a sample size calculation with 80% power and 5% 2-sided significance. They estimate that HIV-free survival at 7 months will be 60% in the control arm.

(i) Calculate the sample size required to detect a 10% increase in HIV-free survival at 7 months in the intervention versus the control arm. (Hint: remember to identify your equation, define all your variables, show all your calculations and conclude appropriately) (10 marks)

(ii) Assume that 5% of mother–infant pairs are lost to follow up prior to the infant reaching 7 months and adjust your sample size calculation accordingly. (4 marks)

EXERCISE

How could you improve parts (i) and (ii) of the example question above?

Please see the concerns that were raised by the Module team over the page


Improving the example question

Here are the views of the Module leader, who raised two questions relating to the clarity of the draft question.

Part (i) - It isn’t clear whether the question is asking students to calculate an absolute or a relative increase. This makes a big difference to the calculation (see below). This is an example of how the omission of one word can make a significant difference to how the student answers!

It is therefore crucial to use technical terminology precisely and avoid ‘expert shorthand’ that could be misleading to a new learner.

i.e. if we are asking about a relative increase:

$$ n = F(\alpha,\beta) \times \frac{p_1 (100 - p_1) + p_2 (100 - p_2)}{(p_1 - p_2)^2} $$

where

$\alpha$ = type 1 error = 0.05
$\beta$ = type 2 error = 1 - power = 1 - 0.8 = 0.2
$F(\alpha,\beta)$ = 7.85
$p_1$ = anticipated percentage of infants in the control group HIV uninfected and alive by 7 months = 60%
$p_2$ = anticipated percentage of infants in the intervention group HIV uninfected and alive by 7 months = 66% (i.e. 60% increased by a tenth)
$n$ = sample size for each group

$$ n = 7.85 \times \frac{(60 \times 40) + (66 \times 34)}{36} = 7.85 \times \frac{4644}{36} \approx 1013 $$

Not accounting for loss to follow up, a sample size of 1013 women per study arm (2026 total) will give us 80% power and 5% significance to detect a 10% relative increase in HIV free survival in the intervention arm from 60% in the control arm.

And if we are asking about an absolute increase:

As before, $p_1$ = anticipated percentage of infants in the control group HIV uninfected and alive by 7 months = 60%
BUT this time, $p_2$ = anticipated percentage of infants in the intervention group HIV uninfected and alive by 7 months = 70% (i.e. 60% plus 10%)
$n$ = sample size for each group

$$ n = 7.85 \times \frac{(60 \times 40) + (70 \times 30)}{100} = 7.85 \times \frac{4500}{100} = 353.25 $$

Not accounting for loss to follow up, a sample size of 353 women per study arm (706 total) will give us 80% power and 5% significance to detect a 10% absolute increase in HIV free survival in the intervention arm from 60% in the control arm.
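The gap between the two readings of part (i) can also be checked numerically. The following is a minimal Python sketch and not part of the original module materials: the function and variable names are illustrative assumptions, and F(α, β) is assumed to be computed as the square of the sum of the standard normal quantiles for 1 - α/2 and for the power, which is the usual origin of the tabulated value 7.85.

```python
from statistics import NormalDist


def f_alpha_beta(alpha: float = 0.05, power: float = 0.80) -> float:
    """Multiplier F(alpha, beta) for a two-sided test (assumed definition)."""
    z = NormalDist()  # standard normal distribution
    return (z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)) ** 2


def sample_size_per_arm(p1: float, p2: float,
                        alpha: float = 0.05, power: float = 0.80) -> float:
    """Unrounded sample size per group; p1 and p2 are percentages (0-100)."""
    numerator = p1 * (100 - p1) + p2 * (100 - p2)
    return f_alpha_beta(alpha, power) * numerator / (p1 - p2) ** 2


p1 = 60.0                 # control arm: 60% HIV-free survival at 7 months
p2_relative = p1 * 1.10   # "10% increase" read as relative: 66%
p2_absolute = p1 + 10.0   # "10% increase" read as absolute: 70%

n_relative = sample_size_per_arm(p1, p2_relative)
n_absolute = sample_size_per_arm(p1, p2_absolute)

print(f"F(alpha, beta) = {f_alpha_beta():.2f}")
print(f"relative increase: about {n_relative:.1f} per arm")
print(f"absolute increase: about {n_absolute:.1f} per arm")
```

The unrounded values differ slightly from a hand calculation that uses F rounded to 7.85, but the relative reading still needs well over twice as many participants per arm as the absolute reading, which is why the missing word matters.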

Part (ii) - Will the formula be included in the question or the provided formulae sheet? This is a straightforward calculation for which there is a formula. Do you expect the students to memorise the formula or will they expect it to be provided?

Being clear about what actually should be tested is the important factor here.


7.3 Longer form Questions - Essay questions

Longer format to allow students to respond to open ended questions at length. Used to test higher skills, writing and structuring skills, further reading and a deeper level of understanding. Assessors are frequently interested in a student's ability to organise and integrate a range of ideas and information and build an argument or make a case (the intellectual skills of synthesis and evaluation, going back to Bloom's taxonomy).

Two types of essay questions can be readily identified, restricted-response and extended-response.

Restricted-response essays focus on understanding of basic knowledge through relatively brief and confined written responses, e.g.

Outline the morphology, genome organisation and replication of the human immunodeficiency virus (HIV).

Extended-response essays allow students to construct a variety of interpretations and explanations and draw upon a wider and more flexibly defined set of information and sources, e.g.

“The burden of disease caused by intestinal parasites in a community reflects the levels of personal and environmental hygiene”. To what extent do you agree with this statement and what are its implications? Make reference to specific infections to support your conclusions.

Table 3. Some Common Essay Style Questions used in Exams

Question Stem
• Give a Quotation – Discuss
• Make an Assertion – Discuss
• Compare and Contrast
• Write-on …
• Outline …
• Describe …
• Explain (with examples) …
• Evaluate …
• Analyse the advantages ...
• Design a …


EXERCISE Look back over recent examination papers set for your course or teaching module and add two more commonly used Question Stems to this list. 1. 2.

7.4 Longer form Questions – Problem Solving / Data handling

Here the students are provided with some data (this could be in written, tabulated, graphical form etc) and then asked a series of questions about it. The provided information may be some research findings or monitoring data. The questions usually begin with a couple of straightforward interpretative questions (e.g. Using the table of infection rates provided, which of the described drug therapies reduces the risk of infection the most?). They then move on to more complex questions of application and analysis that require the students to carry out standard manipulations or calculations of the data provided. The final questions are likely to be more evaluative and open-ended, requiring the students to predict likely impacts or suggest improvements etc.

An Example

On a hot summer day, children in three schools had a school outing to a playground where some of the children played in the recreational fountain. Two days later nearly half the children had symptoms of vomiting, diarrhoea, abdominal pain and headache. A retrospective cohort study was carried out to try to identify the source of the outbreak with the following results.

| Risk factor | Exposed to risk factor: Ill | Exposed to risk factor: Not ill | Not exposed to risk factor: Ill | Not exposed to risk factor: Not ill |
|---|---|---|---|---|
| Ate commercial ice-cream | 72 | 76 | 17 | 22 |
| Drank water from taps near fountain | 3 | 4 | 78 | 89 |
| Drank water from taps near sanitary facility | 18 | 32 | 68 | 64 |
| Played in fountain | 87 | 80 | 4 | 19 |
| Drank from fountain | 25 | 15 | 24 | 75 |

(a) Define what is meant by the risk and relative risk of becoming ill associated with each factor (10%).


(b) Calculate BOTH the risk and relative risk associated with each factor (30%).

(c) Suggest possible interpretations of the results, and the implications for control recommendations (10%).
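A marker preparing guidance for parts (a) and (b) might check the arithmetic with a short script. The sketch below is illustrative only and not part of the original question: it assumes the standard definitions (risk = number ill divided by the total in a group; relative risk = risk in the exposed divided by risk in the unexposed), and the dictionary and function names are made up for this example.

```python
# Counts from the cohort table in the example question:
# (exposed ill, exposed not ill, unexposed ill, unexposed not ill)
COHORT = {
    "Ate commercial ice-cream": (72, 76, 17, 22),
    "Drank water from taps near fountain": (3, 4, 78, 89),
    "Drank water from taps near sanitary facility": (18, 32, 68, 64),
    "Played in fountain": (87, 80, 4, 19),
    "Drank from fountain": (25, 15, 24, 75),
}


def risk(ill: int, not_ill: int) -> float:
    """Proportion of a group that became ill."""
    return ill / (ill + not_ill)


def relative_risk(e_ill: int, e_not: int, u_ill: int, u_not: int) -> float:
    """Risk in the exposed group divided by risk in the unexposed group."""
    return risk(e_ill, e_not) / risk(u_ill, u_not)


# For each factor: (risk if exposed, risk if not exposed, relative risk)
results = {
    factor: (risk(a, b), risk(c, d), relative_risk(a, b, c, d))
    for factor, (a, b, c, d) in COHORT.items()
}

for factor, (r_exposed, r_unexposed, rr) in results.items():
    print(f"{factor}: risk exposed {r_exposed:.2f}, "
          f"risk unexposed {r_unexposed:.2f}, RR {rr:.2f}")
```

On these counts, playing in the fountain gives the largest relative risk (about 3.0), followed by drinking from the fountain (about 2.6).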

The investigators wanted to identify the infectious agent involved. One possibility they considered was norovirus which is known to cause acute gastroenteritis. Although reverse transcription-PCR (RT-PCR) method is considered to be the “gold standard” for diagnosis of this viral infection, it requires skilful personnel and a well-equipped laboratory. A simpler diagnostic kit has been developed. The following table shows how the simpler diagnostic kit compares to the gold standard.

| Diagnostic kit | Gold standard (RT-PCR): Norovirus present | Gold standard (RT-PCR): Norovirus absent |
|---|---|---|
| Norovirus present | 37 | 3 |
| Norovirus absent | 13 | 47 |

(d) Would you advise the investigators to use the simpler diagnostic test in their epidemiological study? Would your recommendations change if the simpler diagnostic test was to be used in clinical practice? Justify your answer. (50%)

[Note on norovirus: this highly infectious RNA virus causes a self-limited, mild to moderate disease that often occurs in outbreaks with clinical symptoms of nausea, vomiting, diarrhoea, abdominal pain, headache, low grade fever or combination of these symptoms. No treatment is indicated apart from rehydration in severe cases. ]
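Part (d) does not name the measures it expects candidates to use. As a hedged illustration of the kind of working a specimen answer might contain, the Python sketch below assumes that sensitivity, specificity and the predictive values are the relevant summaries of the kit-versus-RT-PCR table; the variable names are illustrative and not taken from the original question.

```python
# 2x2 counts from the kit-versus-gold-standard table in the example question
TRUE_POS = 37    # kit positive, RT-PCR positive
FALSE_POS = 3    # kit positive, RT-PCR negative
FALSE_NEG = 13   # kit negative, RT-PCR positive
TRUE_NEG = 47    # kit negative, RT-PCR negative

# Proportion of true infections that the kit detects
sensitivity = TRUE_POS / (TRUE_POS + FALSE_NEG)
# Proportion of uninfected samples that the kit correctly calls negative
specificity = TRUE_NEG / (TRUE_NEG + FALSE_POS)
# Predictive values; these depend on how common infection is in the sample
ppv = TRUE_POS / (TRUE_POS + FALSE_POS)
npv = TRUE_NEG / (TRUE_NEG + FALSE_NEG)

print(f"sensitivity {sensitivity:.0%}, specificity {specificity:.0%}")
print(f"PPV {ppv:.1%}, NPV {npv:.1%}")
```

With these counts the kit has a sensitivity of 74% and a specificity of 94%. How much weight a candidate should give to missed cases versus false alarms in each setting is the judgement that the question is asking them to justify.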

EXERCISE
In section 6 we discussed a number of ways in which a question writer could minimise the impact of extraneous factors, such as stress, interpretation and time-management, in the way they set a question. Please look over the question above and identify at least three ways the question author has sought to do this.
1.
2.
3.


7.5 Longer form Questions – Case study or Scenario based question

In case study styled questions a context or situation is described in detail (e.g. this may be a patient history or government strategy position etc). Such questions are often seen as being very authentic and ask students to apply their knowledge to a particular, and novel, set of circumstances. They frequently take considerable work and effort to write well and usually involve a team of people who craft an idea into a realistic and challenging situation.

Note - Some examples of this type of question are presented in section 11.

Giving Choice

A common structure in examination papers is to have part of the paper 'core', to be attempted by everybody, and other sections which provide a limited amount of choice, e.g. choose 2 from the following list of essay questions to complete. Whilst the structure of exam papers is set by the Board of Examiners and not by individual question setters, it is nevertheless interesting to consider the impact of providing question choice within an exam.

Many people view the giving of choice as a way to increase fairness and reduce the effect of 'luck in question spotting'. It allows students to address the questions for which they feel most prepared and in which they have been most interested, so the examiner sees the 'best' the student can produce. However, providing choice inherently reduces the validity and reliability of the test instrument because each student is in fact taking a different test and has been encouraged to sample from their learning in different ways. It is nearly impossible to create parallel exam questions that test achievement of the learning outcomes to the same extent, and it is equally difficult to grade two different essays absolutely comparably. Both factors make consistency very difficult (Piontek, 2008).

EXERCISE

Do you personally think that the giving of a choice in an examination, (e.g. choose 3 from the following 6 questions) is fair?


8. Evaluating Draft Questions

It is very difficult to write a question and then immediately see the ambiguities or errors that it contains. Separating the 'creating' from the 'evaluating' roles in time can help. Write a question and then come back to it the following day and re-read it with fresh eyes.

When you have a draft question, next write a model/specimen answer and/or some marking guidance. As you do this, come to a decision about the appropriate breakdown of marks and try to estimate how long it will take to tackle the question, part by part. In coming up with the marking scheme for your question you might find it helpful to have the learning outcomes for the module or course in sight to refer to, so that you can check that you are valuing the right things and giving credit to Master's level criteria.

Below is a checklist of questions to use once you have a draft question (doing some of this in a group with questions on overheads can work well):


Checklist for reviewing draft exam questions

1. What is the question intended to measure? (eg factual recall, data processing/analysis skills, problem-solving skills, policy analysis skills, critical analysis skills)

2. What else does it actually measure? (eg does it rely too much on factual recall?)

3. Does it measure what we said we would measure? (Is it aligned with the teaching on the course, the content covered and emphasised and the intended learning outcomes?)

4. How well does the question relate to intended learning outcomes (of the teaching module or MSc)?

5. Is the language simple, clear, unambiguous and straightforward?

6. What are the key words describing the task? Are they clear? (eg list, define, 'suggest reasons behind the effect' are better than interpret, discuss, evaluate)

7. Is the language used easy to understand, including by candidates for whom English is not their first language (eg does it use colloquial phrases)?

8. Check punctuation and grammar as this can markedly change the meaning of sentences (eg “panda eats, leaves and shoots”).

9. Does the question give an advantage or disadvantage to those candidates with particular professional backgrounds (eg medics)?

10. How reliably can the answers be marked?

11. If the question is in sections, is the division of percent of marks between sections appropriate? Are there consequences for later sections if a candidate makes an error in an early section? If yes, how will the marking cope with this possibility?

12. Can the question be completed in the time available (including reading, thinking and reviewing time), including by those for whom English is not their first language?

13. Does the question lead to answers which will distinguish between weak and strong candidates, eg are there elements for candidates to demonstrate distinction-level skills/knowledge?


Question Validation

The Masters programme that you contribute to is likely to have its own process for validating questions and compiling the examination paper. It is important that you ascertain this from the module leader and adhere to it. In general terms, however, once you have written the question, model/specimen answer and marking scheme, ask someone else to answer the question (do not give them the model/specimen answer), timing each part. This allows you to check that your 'time it takes to complete' estimates were about right. Modify the question, timings and marking scheme based on any misunderstandings made clear by their answer.

It can be helpful to agree a 'question swap' with a colleague and undertake an informal peer review of the questions you have both written. This frequently happens across a course team. At this stage you will be ready to submit your question to the module leader, who will also scrutinise it and may get back to you with further suggested improvements (please see the extended case study in the Appendix for further detail about the way The School conducts examination question approval processes).

EXERCISE – Evaluating a question Please read the following draft question and suggest improvements – When you have had a go – turn the page and you will see the changes that the examiners team finally made to the question. Question X Draft – Describe the structure of the cell plasma membrane and its principal components. How and where are plasma membranes usually made in the eukaryotic cell. How are molecules transferred across the membrane into and out of the cell :

Water

Ethanol

Sodium and Potassium ions

Sugars. What other functions in the cell may lipids serve?

Over the page you will find the edited version of Question X that was eventually accepted and used in the examination.


Notes – Exercise – Evaluating some questions

When you submit a question to the Teaching Unit leader it is likely that they will arrange for it to be scrutinised by members of the teaching team and they will make suggestions for improvement. Question X, reviewed on the previous page, ended up looking like this -

Question X accepted – Describe the structure and synthesis of the cell plasma membrane of eukaryotic cells and its principal components. Explain how molecules are transferred across the membrane giving 2 examples.

The questions are usually considered together with the associated marking guidance notes – and for this question these were the guidance notes that were accepted –

The Marking Guidance For Question X

Lipid bilayer membrane consisting of amphipathic lipid molecules with the hydrophobic part inside and the hydrophilic part outside. The major components are lipids of various kinds: these may consist of phospholipids (eg phosphatidyl choline, serine, ethanolamine etc.), triacylglycerols (containing glycerol esterified with saturated or unsaturated fatty acids), glycolipids (eg diacylglycerols with a sugar chain on the third glyceryl OH), sterols or steroids (eg cholesterol) amongst others. May also contain others. Also proteins which may be transmembrane with hydrophobic trans-membrane section or anchored by lipid. Also protein transporters which span membrane and are each responsible for the transport of a limited range of molecules or ions. Most require energy.

Membranes are synthesised in the cytosol of the Endoplasmic Reticulum (ER) where the acylation of fatty acids takes place. Lipids inserted into the inner layer of the membrane are flipped by a 'scramblase' and by 'flippases' which equilibrate lipids between both sides. Lipids are transported between membranes by phospholipid exchange proteins.

a) Diffusion (neutral)
b) Diffusion (lipid-soluble)
c) Ion transporter
d) Specific transporter protein

Fuel storage (ie triglycerides); signalling; pigment


9. Marking Approaches: Using assessment criteria and marking schemes

Assessment criteria test the intended learning outcomes for a course or teaching unit. They describe the knowledge and skills (and possibly attitude) that a student is expected to demonstrate in their examination answers and they are then used in marking the work. The learning outcomes describe what students should be able to do; assessment criteria describe how well they should be able to do it, that is, they set standards. Remember that learning outcomes define the minimum standard required to achieve the award, and so in addition to these the assessment criteria should provide an objective basis for interpreting and differentiating the performance of students at the level of the outcome (a 'satisfactory' pass) and at a series of pre-defined steps above this (usually up to a level considered an 'excellent' or 'outstanding' pass).

For each examination question there should be a model/specimen answer, or a set of specific marking guidance, that is used to mark the associated student answers. These will usually vary with each and every question and are tailored and specific. The assessment criteria are usually more generic and are used as a framework to judge fairly the merits of each student's work across a whole course or teaching unit.

Assessment criteria describe the extent to which students have achieved the specified learning outcomes. They are usually provided at two levels:

o the overarching criteria that describe the different bandings of overall achievement at the Programme level, e.g. First, Two-one, Two-two etc at undergraduate level and Pass/Merit/Distinction categories at Masters level;

o a detailed and specific level of criteria that describe and measure achievement in particular modules of study, or for individual assessment tasks.

Two Different Approaches to Marking

Assigning grades fairly and robustly is a demanding occupation for all teachers and we employ a range of approaches to help us to do this reliably and consistently. Two very different methods are often used simultaneously and symbiotically: norm referencing and criterion referencing.

Norm referencing is all about comparison. When we want to re-mark an earlier exam question having subsequently marked a stronger or weaker essay, we are norm referencing our assessment. When we say, 'is this as good as the other Distinction in the batch of answers?', we are norm referencing.

An ultimate form of norm-referenced assessment is when we attempt to fit our marking profile for a cohort of students to the 'bell-shaped' curve or give our assessment results a 'normal distribution'. This pattern of achievement anticipates that a few students will fail and a similarly small number will get distinctions, whilst the majority will gain marks that cluster and peak in the middle mark range.

You will also sometimes hear experienced assessors referring to a particular piece of student work as providing a 'benchmark'. This is where the answer provided, for various reasons, encapsulates the criteria for a mark or grade: for example, determining the threshold for a distinction. This can be extremely helpful, and is a way in which norm referencing and criterion referencing naturally come together.

Figure 3. A Normal Distribution or bell-shaped curve.

However, not all cohorts will 'fit' this pattern. For example, a 'Computing for Beginners' course could form a two-peak pattern, with one cluster of students achieving very high marks (the students who could have taught the course!) and a second cluster with marks at the bottom of the range (i.e. those who had never done any computing before!). Absolute norm referencing also has the characteristic of effectively setting quotas: only so many students can get 'A's and only so many can get 'B's etc. The application of a 'bell-shaped curve' to small groups or cohorts of students becomes clearly unfair where variations between groups, say from year to year, are likely to give rise to very different patterns of achievement.

Criterion referenced grading, on the other hand, specifies a standard through the description of clear criteria, and anybody who achieves the level or standard described gains the marks. Everybody in the cohort could therefore potentially get an 'A', and each student's work is individually judged in comparison to the criteria, regardless of what other students may or may not do.
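The contrast between quotas and fixed standards can be made concrete with a small Python sketch. Everything in it is a hypothetical illustration rather than School practice: the grade thresholds, the quota percentages, the cohort marks and the function names are all assumptions chosen to show the mechanism, and ties between equal marks are not handled.

```python
# Hypothetical criterion thresholds (minimum mark for each grade), highest first
THRESHOLDS = [("A", 70), ("B", 60), ("C", 50)]
# Hypothetical norm-referenced quotas: percentage of the cohort given each grade
QUOTAS = [("A", 20), ("B", 30), ("C", 30), ("F", 20)]


def criterion_referenced(marks, thresholds, fail_grade="F"):
    """Grade each mark against fixed standards, ignoring other students."""
    grades = []
    for mark in marks:
        grade = fail_grade
        for label, cutoff in thresholds:
            if mark >= cutoff:
                grade = label
                break
        grades.append(grade)
    return grades


def norm_referenced(marks, quotas):
    """Rank the cohort and hand out grades by quota, best students first."""
    n = len(marks)
    order = sorted(range(n), key=lambda i: marks[i], reverse=True)
    grades = [None] * n
    for rank, student in enumerate(order):
        cumulative = 0
        for label, percent in quotas:
            cumulative += percent
            # Integer comparison: is this rank inside the cumulative quota?
            if rank * 100 < cumulative * n:
                grades[student] = label
                break
    return grades


strong_cohort = [82, 78, 75, 74, 71]  # a small group where everyone did well
print(criterion_referenced(strong_cohort, THRESHOLDS))
print(norm_referenced(strong_cohort, QUOTAS))
```

Under the fixed thresholds every student in this small, strong cohort receives an 'A', whereas the quota scheme spreads the same marks across all four grades, including a fail for a mark of 71.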


EXERCISE
Please consider the strengths and limitations of both forms of grading work.

Norm-referenced assessment
Strengths:
Weaknesses / limitations:

Criterion-referenced assessment
Strengths:
Weaknesses / limitations:


In The School's Assessment Code of Practice we can see some guidance and clarity on this issue.

Using the full mark range – Advice from the School (Code of Practice 2012)

Markers are encouraged to use the full range of available marks, to reflect the full range of student achievement. In particular, markers should not feel reluctant to award 5.0 grades provided work meets the appropriate standards. The following specific points should be noted –

• 'Excellent' work does not have to be 'outstanding' or exceptional by comparison with other students.

• Since the School uses criterion-referenced marking rather than banded marking, 5.0 grades should not be capped to a limited proportion of students per class.

• There is no standard cut-off for what constitutes 'excellent' work. In many cases where quantitatively-scored assessments are used, a 5.0 grade may be awarded for work scoring above a particular threshold (for example 80%) of the possible marks, i.e. by no means perfect but of a sufficiently high standard.

• Good assessment design should ensure that tasks have clear criteria to allow excellent students to achieve 5.0 grades.

LSHTM Grade descriptors for assessed work (Code of Practice 2012)

The School uses a standard assessment system, marking against six gradepoints: integers from 0 to 5. These are equivalent to the old grades of A, B+, B, C, D and E. Grades 2 and above are pass grades (grade 5 can be seen as equivalent to distinction standard), whilst grades below 2 are fail grades.


Table 4. LSHTM Marking Gradepoints descriptions (Overarching criteria)

| Grade point | Descriptor | Typical work should include evidence of… |
|---|---|---|
| 5 | Excellent | Excellent engagement with the topic, excellent depth of understanding & insight, excellent argument & analysis. Generally, this work will be 'distinction standard'. NB that excellent work does not have to be 'outstanding' or exceptional by comparison with other students; these grades should not be capped to a limited number of students per class. Nor should such work be expected to be 100% perfect – some minor inaccuracies or omissions may be permissible. |
| 4 | Very good | Very good engagement with the topic, very good depth of understanding & insight, very good argument & analysis. This work may be 'borderline distinction standard'. Note that very good work may have some inaccuracies or omissions but not enough to question the understanding of the subject matter. |
| 3 | Good | Good (but not necessarily comprehensive) engagement with the topic, clear understanding & insight, reasonable argument & analysis, but may have some inaccuracies or omissions. |
| 2 | Satisfactory | Adequate evidence of engagement with the topic but some gaps in understanding or insight, routine argument & analysis, and may have some inaccuracies or omissions. |
| 1 | Unsatisfactory / poor (fail) | Inadequate engagement with the topic, gaps in understanding, poor argument & analysis. |
| 0 | Very poor (fail) | Poor engagement with the topic, limited understanding, very poor argument & analysis. |
| 0 | Not submitted (null) | Null mark may be given where work has not been submitted, or is in serious breach of assessment criteria/regulations. |


Summative assessment combines these marks into non-integer gradepoint averages (GPAs) in the range 0 to 5, by averaging against relevant weightings. The School does not set any fixed ‘percentage to gradepoint’ conversion scheme. Rather, the conversion should be done using a scheme agreed in advance by the relevant Board of Examiners that best fits the particular assignment or question. The approved conversion should appear in the marking pack for each assessment/question for which it is to be used. Table 5 below gives examples of three different percentage-to-gradepoint conversion charts.

Table 5 – Examples of grade conversions used at the School.

| Grade point | Mark (%), example with typical scheme | Mark (%), example with higher numeric pass threshold | Mark (%), example with lower numeric pass threshold |
|---|---|---|---|
| 5 | 80-100 | 95-100 | 75-100 |
| 4 | 70-79 | 85-94 | 60-74 |
| 3 | 60-69 | 75-84 | 45-59 |
| 2 | 50-59 | 60-74 | 30-44 |
| 1 | 40-49 | 50-59 | 20-29 |
| 0 | <40 | <50 | <20 |

Students should be made aware of the criteria on which all assessment tasks will be marked, to improve their understanding of the standards expected of them. The criteria used to place students in each grade category must be written down by staff setting assessments, and adhered to by all those involved in the marking.
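As an illustration only, the three example conversion schemes in Table 5 and the weighted gradepoint average described above could be sketched as follows. The scheme names, the function names `to_gradepoint` and `gpa`, and the treatment of fractional percentages (a mark of 49.5% falls in the band below 50) are assumptions made for this sketch; they are not School policy, and any real conversion must be the one approved by the Board of Examiners.

```python
from bisect import bisect_right

# Lower bounds (in %) for gradepoints 1 to 5, copied from the three
# example schemes in Table 5. The dictionary keys are hypothetical labels.
SCHEMES = {
    "typical": [40, 50, 60, 70, 80],
    "higher_pass_threshold": [50, 60, 75, 85, 95],
    "lower_pass_threshold": [20, 30, 45, 60, 75],
}


def to_gradepoint(percent, scheme="typical"):
    """Convert a percentage mark (0-100) to an integer gradepoint (0-5)."""
    if not 0 <= percent <= 100:
        raise ValueError("percentage mark must lie between 0 and 100")
    # Count how many lower bounds the mark has reached.
    return bisect_right(SCHEMES[scheme], percent)


def gpa(gradepoints, weights):
    """Weighted average of gradepoints; weights need not sum to 1."""
    if not weights or len(gradepoints) != len(weights):
        raise ValueError("need one weight per gradepoint")
    total = sum(g * w for g, w in zip(gradepoints, weights))
    return total / sum(weights)


if __name__ == "__main__":
    for name in SCHEMES:
        print(name, to_gradepoint(62, name))
    print(round(gpa([3, 4, 2], [0.5, 0.3, 0.2]), 2))
```

The same mark of 62% maps to gradepoint 3 under the typical scheme, 2 under the higher-threshold scheme and 4 under the lower-threshold scheme, which is why the approved scheme needs to appear in the marking pack.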


10. Ways of producing accurate and clear marking guidance for questions

Experienced question setters recognise the importance of writing exam questions and their associated marking guidance together and seeing them as a whole unit. When you are considering the dimensions of a question, have a separate sheet of paper or screen open on which you can make notes about the expected answers the question should elicit. Also make a note of any likely errors or misunderstandings the students may make.

The value of good marking guidelines

Well-developed marking guidelines:

Help identify problems with assignments.

Confirm different types of possible responses to the assignment question(s) and what knowledge and/or skills are being tested.

Establish the content necessary for achieving marks at each level.

Encourage consistency between marking team members.

Provide ideas and wording for constructive feedback to students regarding what would have constituted a good or better response.

Marking guidelines should be based directly on the Assessment criteria. For some modules, such as those that are quantitative in nature, there is probably a need for model/specimen answers, in addition to or instead of marking guidelines.

The Assessment criteria will serve as the basis for the development of the marking guidelines. For each criterion I suggest that you initially think about the major steps in the continuum of student achievement – i.e. what do you expect from a ‘Pass’ answer at a 50% grade level and what would you expect of a ‘Distinction’ answer?

Firstly, for each criterion, consider carefully what you expect students to have written to achieve a passing mark for this criterion. Draft a detailed description of the content and quality that markers should evaluate, in addition to what has been included in the assignment instructions. Ask yourself: “What would comprise the minimum of what I would expect the student to have written for this section, or about this subject, to achieve a passing mark?” This description or set of required concepts/ideas/issues/definitions will serve as the basis for a grade of ‘2’.

Once the basic expectations for a ‘2’ grade have been drafted in association with the original criteria, it is then necessary to describe what additional level of content and/or quality would achieve higher marks (3, 4, 5). Please draft descriptions of what components might achieve each of these higher marks.

(A note on Distance Learning assessments: N.B. Remember that many students do not have access to other facilities, such as libraries, so the student must be able to respond to the question by reference to the study materials ONLY and still achieve a high mark. It should be possible to obtain a ‘5’ by original and creative use of nothing more than the materials provided. All instructions should be devised to allow scope for imaginative input and cross-referencing from students who have access to nothing more than the course materials. For a ‘5’ grade in particular, it is original thought, not extra facts, that would contribute.)

You may well find that, depending on the nature of your course, module or subject area, there is one type of criterion that tends to take precedence in differentiating the marks. For example, in a strongly practice-based, professional course, the quality and authenticity of reflective practice may be a priority criterion. In courses concerned with exploring the impact of public policy decisions and practices, the lead criteria may be those emphasising the application of key principles and the analysis of outcomes. If there are lead criteria, then a transparent approach would be to emphasise these in advance to students both within the teaching and the assessment design. There should also be links made between the criteria and the intended learning outcomes that help to show students where the emphasis lies.

Finally, based on the basic criteria for a passing mark (‘2’), draft a list of fundamental omissions or errors that would result in a ‘1’ or even in a ‘0’ fail.

A question to challenge yourself with - Does a grade of ‘Outstanding’ actually equate to ‘Impossible to achieve’?

This is particularly important if you are likely to be assessing „essay‟ style questions rather than numeric or quantitative questions. It is possible to score 100% in a calculation answer and virtually impossible to score more than 80% in a discursive essay style answer. You have to give your students the opportunity to be able to excel – you need to consider how your more able students can demonstrate their additional qualities, creativity or more in-depth knowledge or understanding to you. This is often a difficult thing to achieve, i.e. to incorporate into the question design an opportunity to differentiate between your able and excellent students.


In Summary – Fundamental points in drafting marking guidance

1. Give precise descriptions of what is required for a minimum pass grade (‘2’). This should include details of the key elements or standards the student needs to achieve a passing grade.

2. Provide descriptions of what could be added to the minimum required response that would result in the student achieving a higher grade (3=good, 4=very good, 5=excellent).

3. Include a description of omissions or errors that would define a 1=unsatisfactory (fail) or 0=fail grade.

4. Write a description of the elements expected for each question component or part of a question set in an assignment. This not only helps markers to recognise what the question-setter expects by way of an answer and how to grade it, but also enables the external examiners to understand why different grades were awarded.

5. Write guidance that allows for some degree of marker’s discretion. Leave room to offer marks for e.g., originality, reading beyond the subject area and creative thinking, good use of examples, clear description of reasoning. This might include points for the integration of concepts or methods from other study module materials.

6. For questions where a numerical mark is to be given, please give a definition of how many marks can be awarded for each question or part of question (if not already designated in assessment criteria). This point system should allow for some discretion for original thought, where relevant. If the numerical answers to a question are sequential (i.e. they use numbers calculated in earlier parts of the question), the marking guidelines should state that marks should be given for process rather than purely the correct answer.

Adapted from “Guidance for Drafting Assignment Instructions and Marking Guidance”, Public Health and Health Service Management Distance Learning Courses
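Point 6 above asks that marks be given for process when later answers depend on earlier ones. One way a marking team might make that rule explicit is sketched below for a relative risk calculated from two earlier risks. The function name, the 3-mark allocation, the tolerance and the decision to award full marks for a correctly carried-forward error are all assumptions for illustration; a real scheme would state its own rule.

```python
def mark_relative_risk(student_risk_exposed, student_risk_unexposed,
                       student_rr, correct_rr, marks=3, tolerance=0.01):
    """Return the marks for one relative-risk answer, crediting process.

    Full marks are given if the student's relative risk matches the
    correct value, or if it follows correctly from the student's own
    (possibly wrong) risks calculated earlier in the question.
    """
    if abs(student_rr - correct_rr) <= tolerance:
        return marks
    if student_risk_unexposed > 0:
        follow_through = student_risk_exposed / student_risk_unexposed
        if abs(student_rr - follow_through) <= tolerance:
            return marks  # method is right; the slip was made earlier
    return 0


if __name__ == "__main__":
    # Earlier slip (0.50 instead of 0.52) carried forward correctly.
    print(mark_relative_risk(0.50, 0.17, 2.94, 3.00))
    # Relative risk that matches neither the correct value nor the
    # student's own risks.
    print(mark_relative_risk(0.50, 0.17, 1.50, 3.00))
```

Under this sketch the student loses marks once, for the original slip in the risk, and is not penalised again for every later figure that depends on it.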


Here are a couple of examples showing how the marking guidance gives clear links to the grading structure and differentiates between the possible grades.

Example 1.

Question

Discuss what is meant by the term “epidemic”. Describe the main features of an epidemic curve. Identify the main types of epidemic, giving examples.

Marking Guidance (Based on Teaching Session 3 and the Webber book chapter 2)

Grade 3 answers should contain the following points ...

Epidemic – excess of cases in the community from that normally expected.

Characteristics of an epidemic: latent period, incubation period, period of communicability.

Common source (point source/extended source) and propagated source epidemics.

Grade 2 answer would have some omissions.

Grade 1 answer would have serious errors of interpretation.

Grade 4/5 should provide a comprehensive discussion of the topic including relevant examples.

Example 2.

Question

What has been the impact of HIV on the epidemiology and control of TB?

Marking Guidance (Based on Section 2 Teaching Session 3 and the TB/HIV clinical manual)

A Grade 3 answer should provide basic information on the epidemiology and control of TB including –

Infectious agent Mycobacterium tuberculosis

Transmission person to person: person with pulmonary TB coughs and produces infectious droplet nuclei

Transmission generally occurs indoors as direct sunlight kills tubercle bacilli

Progression to disease higher in children and people with immunodeficiency.

Main strategy of TB control is to detect and cure cases of pulmonary TB

DOTS strategy

And an indication of how this changes in populations affected by HIV

HIV is driving the TB epidemic in many countries, especially in sub-Saharan Africa and increasingly in Asia and South America.

People infected with HIV at increased risk of developing TB.

Extrapulmonary and smear-negative pulmonary TB cases, which are more difficult to diagnose, account for an increased proportion of total cases.

More adverse drug reactions.

Risk of TB recurrence is higher.

Diagnosis more difficult, especially in children.

Control: TB and HIV/AIDS share mutual concerns. Prevention of HIV should be a priority for TB control, TB care and prevention should be a priority of HIV/AIDS programmes.

HIV exposes any weaknesses in TB control programmes.

Rise in TB suspects puts strain on diagnostic services.

Stigma associated with HIV/AIDS can affect uptake of TB services.

A Grade 2 answer may include some of these points, or alternatively all of these points but with insufficient discussion. Grade 1 and below would include some of these points but with significant errors of interpretation. Grades 4/5 answers will be an intelligent structured discussion of how HIV impacts the epidemiology and control of TB including other relevant points in addition to the ones listed above.

It is interesting to note that in both these examples the assessor has chosen to provide a description for a Grade 3 answer first – describing a point near the middle of the grade-scale, the peak of the normal distribution, before going on to relate higher (4/5) and lower (2/1) scoring grades to this mid-point.


EXERCISE Consider an examination question that you have written or are currently in the process of drafting. Produce some marking guidance for the question that provides clear descriptions that differentiate between the Grades (0 to 5). Think about which point on the grading scale you find it easiest to begin with.

Quantitative Question Formats and Marking Guidance

In section 7.4 I gave an example of a problem-solving formatted question that had a number of sub-sections and itemised tasks within it. Here a set of Marking Guidance for that question (about school children playing in a fountain) is provided as an example of guidance for a quantitative question.

Example 3.

Question

On a hot summer day, children in three schools had a school outing to a playground where some of the children played in the recreational fountain. Two days later nearly half the children had symptoms of vomiting, diarrhoea, abdominal pain and headache. A retrospective cohort study was carried out to try to identify the source of the outbreak with the following results.

| Risk factor | Exposed to risk factor: Ill | Exposed to risk factor: Not ill | Not exposed to risk factor: Ill | Not exposed to risk factor: Not ill |
|---|---|---|---|---|
| Ate commercial ice-cream | 72 | 76 | 17 | 22 |
| Drank water from taps near fountain | 3 | 4 | 78 | 89 |
| Drank water from taps near sanitary facility | 18 | 32 | 68 | 64 |
| Played in fountain | 87 | 80 | 4 | 19 |
| Drank from fountain | 25 | 15 | 24 | 75 |


(a) Define what is meant by the risk and relative risk of becoming ill associated with each factor (10 marks).

(b) Calculate BOTH the risk and relative risk associated with each factor (30 marks).

(c) Suggest possible interpretations of the results, and the implications for control recommendations (10 marks).

Marking Guidance

Risk = number of exposed children who were ill / total number exposed

Relative risk = risk in exposed / risk in unexposed

Give 5 marks each for these definitions: total 10 marks.

Give 3 marks for each correct risk & 3 marks for each correct relative risk: total 30 marks.

Main risk factor is playing in the recreational fountain. This suggests that the source of the outbreak is water in the fountain, possibly indicating faecal-oral transmission. Water in the fountain should be tested regularly for relevant bacteria and viruses (e.g. E. coli, Salmonella, norovirus) and should be monitored to ensure that adequate levels of chlorine are present in the water. Alternatively, children could be prevented from playing in the fountain (however, on a hot sunny day it may be difficult to keep them out of the water!). Up to 10 marks for that or similar relevant comment.
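When preparing specimen answers for part (b), it can help to check the arithmetic mechanically. The sketch below applies the two definitions from the marking guidance to the counts in the question table; the names `COHORT`, `risk` and `relative_risk` are chosen here for illustration and are not part of the original guidance.

```python
# (ill, not ill) among exposed and among unexposed children,
# copied from the table in the question.
COHORT = {
    "Ate commercial ice-cream": ((72, 76), (17, 22)),
    "Drank water from taps near fountain": ((3, 4), (78, 89)),
    "Drank water from taps near sanitary facility": ((18, 32), (68, 64)),
    "Played in fountain": ((87, 80), (4, 19)),
    "Drank from fountain": ((25, 15), (24, 75)),
}


def risk(ill, not_ill):
    """Risk = number ill / total number in the group."""
    return ill / (ill + not_ill)


def relative_risk(exposed, unexposed):
    """Relative risk = risk in exposed / risk in unexposed."""
    return risk(*exposed) / risk(*unexposed)


if __name__ == "__main__":
    for factor, (exposed, unexposed) in COHORT.items():
        print(f"{factor}: risk in exposed = {risk(*exposed):.2f}, "
              f"risk in unexposed = {risk(*unexposed):.2f}, "
              f"relative risk = {relative_risk(exposed, unexposed):.2f}")
```

With these counts, playing in the fountain gives the highest relative risk (about 3.0), followed by drinking from the fountain (about 2.6), while the other three factors have relative risks close to or below 1. This agrees with the interpretation given in the marking guidance for part (c).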

The investigators wanted to identify the infectious agent involved. One possibility they considered was norovirus, which is known to cause acute gastroenteritis. Although the reverse transcription-PCR (RT-PCR) method is considered to be the “gold standard” for diagnosis of this viral infection, it requires skilled personnel and a well-equipped laboratory. A simpler diagnostic kit has been developed. The following table shows how the simpler diagnostic kit compares to the gold standard.

| Diagnostic test | Gold standard: Norovirus present | Gold standard: Norovirus absent |
|---|---|---|
| Norovirus present | 37 | 3 |
| Norovirus absent | 13 | 47 |

(d) Would you advise the investigators to use the simpler diagnostic test in their epidemiological survey? Would your recommendations change if the simpler diagnostic test was to be used in clinical practice? Justify your answer. (50 marks)

Marking Guidance

Sensitivity = 37/50 = 74%

Specificity = 47/50 = 94%
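As a quick check of these two figures, the calculation can be sketched from the four cells of the comparison table. The function and variable names are illustrative assumptions; the counts are those given in the question.

```python
def sensitivity(true_pos, false_neg):
    """Proportion of gold-standard positives that the simpler test detects."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Proportion of gold-standard negatives that the simpler test clears."""
    return true_neg / (true_neg + false_pos)


# Counts from the table: rows are the simpler test, columns the gold standard.
TRUE_POS, FALSE_POS = 37, 3    # simpler test says norovirus present
FALSE_NEG, TRUE_NEG = 13, 47   # simpler test says norovirus absent

if __name__ == "__main__":
    print(f"Sensitivity = {sensitivity(TRUE_POS, FALSE_NEG):.0%}")
    print(f"Specificity = {specificity(TRUE_NEG, FALSE_POS):.0%}")
```

Sensitivity uses the 50 gold-standard positives (37 + 13) and specificity the 50 gold-standard negatives (3 + 47), which is where the "1 in 4 true cases missed" and "6/100 without the disease" figures used in the discussion points come from.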


Give 5 marks each for calculation of sensitivity and specificity (10 marks).

Discussion of whether or not to use the test in (i) epidemiological survey or (ii) clinical setting: up to 40 marks for answers that identify the key requirements of a diagnostic test in the two situations and use information from the calculation of sensitivity and specificity correctly. Some of the following points may be included in the answer:

Possible implications of missing true cases (1 in 4 true cases will be missed) and of a diagnosis in true negatives (6/100 people without the disease will be “diagnosed” by the test).

(i) Epidemiological survey: simpler diagnostic test will be adequate to identify the outbreak. Do not need to identify all cases to recognise that this is an outbreak. As large numbers to be tested consider cost/resources/time savings of using the simpler test.

(ii) Clinical practice: what are the implications of missing 1 in 4 true cases? As treatment non-specific (rehydration therapy) a false-negative diagnosis with respect to norovirus would be unlikely to affect the outcome of the illness in the individual. However consider whether other investigations may be undertaken in people with symptoms who have tested negative for the disease. Also as norovirus is known to be very infectious consider impact on behaviour of having a diagnosis of the disease. Also identification of contacts. Less issue for time/resources in the clinical setting so it may be better to go for the gold standard test.

Other factors: cost, resources, time (up to 10 marks).

11. Ways of producing model / specimen answers suitable for distribution to students

Providing students with past papers and specimen answers before an examination is one way of providing transparency and clarity in what is expected and valued in an answer. Providing students with a specimen answer after their papers have been marked also helps them to review their own learning and acts as a form of ‘feedback’. It also helps to tailor any individual feedback to the particular needs of a student (e.g. in tutorials) rather than having to cover everything generically.

Some teachers worry that providing a model/specimen answer can have a reductionist impact on learning and in some way act to limit the scope and individuality of students in responding to questions. This may be a valid concern if the answers expected require students to come to their own conclusions and argue a particular perspective rather than processing information which in many ways can be considered factually right or wrong or where there is an acknowledged ‘best interpretation’.

An assessor needs to consider what can and should be conveyed in a model/specimen answer and this will hugely influence the style of specimen answers that should be provided. For example, where the answer is conveying correct interpretations or knowledge of undisputed facts, specimen answers can take the form of brief, summary note answers that identify key pieces of knowledge or explanations of why a particular answer is correct (or more correct than others). However, if answers are expected to ‘use evidence’ or ‘explain with reference to the literature’, the specimen answer provided should seek to model good practice in these academic skills whilst also emphasising that there may be other ways of achieving positive results. In very open-ended response questions it may be best to provide brief outlines for two or three different possible interpretations and arguments – this can be particularly useful in a ‘feedback’ mode of presentation in which students come, review and then discuss the different approaches taken, thus encouraging students to find their own ‘voice’.

You may like to refer to the extended case study provided in Appendix 3 that takes you through the steps of exam question and marking guidance development together with extracts from the module team discussions.

12. Exam question development, validation and approval processes

The development and approval of questions is the responsibility of the course team and is usually a process started towards the end of the Autumn term, as refining questions and marking guidance takes quite a lot of time to do well and collaboratively. You have an opportunity now, if you wish, to review an extended case study showing the approach adopted by one course team and showing the development process for one question. Please refer to the extended case study provided in Appendix 3 to see the process by which questions are produced by module teams.

This extended case study, based on a real example, aims to show the stages of development that the question went through and reflections on the process made by the course team (shown in comment boxes). The case study includes the following sections:

3.1 Question Background

3.2 A Work in Progress (presented in four steps)

i) An Early Draft with Feedback (Autumn Term)

ii) The Question Amended after Feedback from the Exam Chair (July)

iii) Some Fine Tuning (Final Version)

iv) A Completed Work? (Some reflections on the use of the question)

3.3 A reflective exercise


Following the initial grading process, Module Organisers should look at the distribution of grades for the particular Module. If this deviates significantly from past performance or appears to differ significantly from other grade distributions at Course, Faculty or School level, this should be considered in more depth – to confirm that the marks given are indeed in line with School criteria. In some cases, Module Organisers may wish to recommend re-marking, procedures for which are detailed in the Guidance Notes for Boards of Examiners.

LSHTM Assessment Code of Practice (2012)

13. Security issues

It is important to be aware of information security matters when handling exam questions. Please follow the procedures set for your course carefully.

E.g.:

password protect files,

place in electronic but secure tutor area

hand-deliver rather than send in unsealed envelopes, or at least seal and mark as confidential to the named recipient,

be careful not to leave overhead transparencies on the projector etc.

It is also important to remember that any grade divulged before the final meeting of the Board of Examiners is a provisional grade, subject to external review and may be amended at the discretion of the examiners.

Appeals

When thinking about the way we write examination questions and conduct summative assessments it is worthwhile remembering that candidates may appeal against a result where there is concern that the examination has not been conducted in accordance with School policies and procedures. However, the University of London does not allow appeals on purely academic grounds, such as challenging the interpretation of a concept or principle.


14. Providing support and guidance for students (formative assessment /practice opportunities)

Strategies to support students are usually based upon two guiding principles:

a) Transparency – students knowing how and when their learning achievements are going to be judged and evaluated from the outset of their studies;

b) Providing opportunities to practise and rehearse the ways in which they will be required to demonstrate their learning achievements.

A common approach used in the School is to provide examples of past papers and examiners’ reports so that students can see the process of assessment clearly. It is also desirable to provide opportunities for students to experience assessment forms and formats before they ‘count’. Building ‘mock’ examinations into the module or course and giving students feedback on their approach and success is one way that this can be done. Having formative assessment that mirrors the summative assessment can also be helpful. This is especially true for students at the School who may have had very diverse experiences of education and assessment processes prior to their Masters courses either in London or by DL.

The School has produced some guidance on the delivery of feedback to students after formal course work assessments - this particularly highlights the need for clarity, transparency and speed of feedback turnaround time (see below). However, I would also like to emphasise the need to provide constructive feedback on the formative, ‘practice’ or mock assessments that are part of the teaching units at the School. Feedback here needs to be focussed on helping the students to ‘do it better next time’ – or to coin a phrase “Feed-forward”.

It may be helpful to keep in mind that ultimately learning is a transformative process, personal to the individual, that isn’t confined by or restricted to set points of assessment. Marks provide useful measures and milestones, particularly within formal course structures, but we also want our students to understand that learning is lifelong, and to develop the skills needed to become sophisticated lifelong learners.

EXERCISE How can you provide support for your students as they prepare for and participate in examination assessments?


Feedback of progress to students (Assessment Policy 2009)

The system of feedback should be made clear to students before they undertake their first piece of assessment.

For courses taught face-to-face in London, this may be set out in course handbooks and should be explained or restated by the Course Director at an appropriate point.

For courses delivered by Distance Learning, details of what students can expect will be made clear in course and module handbooks or other materials as appropriate.

For Course work components - Students should be given feedback on their progress within a defined time period, measured in weeks, during which double-marking takes place, feedback is written, and grades and feedback are passed to Course Administrators in the Teaching Support Office or DL Office to send to students. Feedback will consist of full comments on the piece of work plus a grade, the former being used to give informative guidance to the student on progress made.

For courses taught face-to-face in London, the standard turnaround time for marks to be agreed and feedback given to the student is within either three weeks of the deadline for handing in the work in term time, or the end of the first week of the next term, whichever is later.

For courses delivered by Distance Learning, turnaround times for marks to be agreed and feedback given to the student may be more variable; but clear guidance on the timeframe within which this should happen will be given to all marking and administrative staff.


Feedback on Examinations

School policy is that for coursework and project reports, students should receive individual feedback to aid their learning. For the June exams, students receive their grades. For DL courses, Examiner’s Reports for Students are prepared on expectations with references to marking schemes.

Here is an example of an Examiner’s Report for Students that shows variety in feedback depending on question type. Please note where the Examiner has identified what would be required for a ‘sound pass’ and what would be expected of an answer awarded the higher grades.

ID2 Examiner’s report for students

Question 1

Overall, all the sections of question 1 were well answered by the candidates who attempted this question. The standard of the answers was high and showed depth of understanding. All major points were covered for questions 1a-1e. The points that were expected to be included to gain a good mark are detailed below.

a) Gram stain

Expected: Differential staining technique that distinguishes bacteria based on cell wall structure. Description of stain (crystal violet and iodine) and staining technique (acetone decolourises Gram-negative cells). Gram-positive stain purple, Gram-negative stain pink. Gram stain also allows shape of bacteria to be noted, i.e. round versus rod-shaped.

b) Lipopolysaccharide

Expected: Cell wall of Gram-negative bacteria. Structure: Lipid A (toxin activity), conserved core polysaccharide (KDO), O-specific side chain polysaccharides responsible for many serotypes for example in salmonella. Endotoxin. Toxic shock syndrome.

c) Bacillus anthracis

Expected: Gram-positive. Spores. Aerobe. Toxins edema factor/lethal factor, capsule. Anthrax: Lung or skin infection. Zoonosis. Biological warfare. Ciprofloxacin, penicillin.

d) Treponema pallidum

Expected: Spirochete. Motile, Gram-negative. Cannot be cultured. Immunofluorescence test. Syphilis, STD. Three stages, primary, secondary, tertiary. HIV associated.


e) Corynebacterium diphtheriae

Expected: Gram-positive. Non-motile. Blood tellurite agar (black colonies). Childhood infection of upper respiratory tract. Aerosol. Toxin (tox gene; regulated by low iron concentrations). Vaccination against toxin; penicillin kills bacteria but does not inactivate toxin.

Question 2

For a safe pass, the student should have discussed that N. meningitidis is a Gram-negative diplococcus, is non-motile and lives in the upper respiratory tract of a certain percentage of the population. It causes meningitis and other diseases / symptoms by crossing the blood-brain barrier through the same path that neutrophils use. As virulence factors, it has pili and fimbriae to attach, endotoxins causing inflammation to help entry, capsules to interfere with complement attack as well as phagocytosis, killing and degradation by macrophages/neutrophils, and IgAase to neutralize IgA. It can be typed by several capsule serotypes, which are not all covered by the available vaccine. Diagnosis needs to be very fast, since those most affected are children and teenagers, who can succumb to the disease rather fast. Antibiotic therapy needs to be started quickly. It would have been excellent to name a few relevant antibiotics. For diagnosis, growth test using CF and blood on chocolate / blood agar, and test for sugar usage, and the latex agglutination test should be mentioned, as well as other possible tests including PCR. The more details, the better the score.

Question 3

For a safe pass the student should have named 2 zoonotic infections such as brucellosis, salmonellosis (Salmonella typhimurium), listeriosis (M. bovis), leptospirosis, psittacosis, tularaemia (Francisella), anthrax (Bacillus anthracis), Coxiella (Q fever), Lyme disease and so on, so lots of choices. Better grades could have been achieved by describing their reservoirs, life cycles and diseases in detail as well as how they can be controlled. Some of them have a more complicated life cycle and are transmitted by vectors (Borrelia, Coxiella, Y. pestis); some come from specific hosts (M. bovis from ruminants, Leptospira from rats); B. anthracis makes spores and is therefore difficult to eliminate by simple disinfection, and the cadavers need to be incinerated. If these issues were detailed, students would have scored high marks.

Comment (M1): This is helpful for the student to know what is a “safe pass”

Comment (M2): This is good


Question 4.

This question is based on the paper in your reader (Bahl et al). The questions help you understand and interpret the data that are given in the tables, and help you follow the discussion of the data by the authors.

a) First of all, always read the titles/headers of tables carefully, because these tell you what exactly is presented in the table: what is measured and how, what the numbers mean, etc. The two tables give you different information: table 1 counts episodes, and therefore gives you incidence, whereas table 2 gives prevalence, that is ‘days-with-disease’ during the observation period. Looking at ‘risk’ for diarrhea, we see in table 1 that children with low plasma zinc are ‘at increased risk’ because there is a higher incidence of diarrhea with a significantly higher RR (Relative Risk, significant when the confidence interval does not contain 1): 1.47 (1.03, 2.09). There is also a significantly higher risk for severe diarrhea (1.70), but the RR for prolonged diarrhea is not significantly different (RR of 2.54, but confidence interval contains 1). This is further supported by the prevalence data in Table 2, where we see that only the diarrhea with fever (= more severe diarrhea) is significantly more frequent in the children with low plasma zinc. There is no significant difference in the prevalence of the other morbidities between the children with low and with normal plasma zinc (see the P-values in the table).

b) First, read carefully. On what data are these statements based? In table 1 you can see that the number of episodes of ALRI was not different between the groups (Confidence Interval contains 1). However, the total number of days with ALRI, as presented in Table 2, was significantly higher in the children with low plasma zinc. Therefore, one has to conclude that there must have been more days per episode in the children with low plasma zinc.

Comment (M3): This is helpful for students

15. Concluding Remarks

It is sincerely hoped that this workbook has provided the guidance and information you need to produce demanding but fair examination questions, and their associated marking guidance and assessment criteria, for your modules. To conclude, here are a few summary points.

The process

Start early – the process of innovating, drafting, gaining feedback, re-drafting etc takes time and it is important to begin thinking about examination questions in good time.

Collect all relevant documents together before you start, e.g. the learning outcomes for your module, standard grading schedules and criteria, any past examination papers you have from recent years, any feedback from internal and external examiners etc.

Produce examination questions and marking guidance in parallel, at the same time – thinking about what you expect from your students will help you to clarify your questions.

Think about who can help you ‘fine-tune’ your draft questions – colleagues can see things with fresh eyes and help you avoid ambiguity; they can also give you feedback on how long the question will take students to answer.

The examination questions

Questions that are divided into discrete sub-sections and are accompanied by their associated marking schedules have many benefits for both students and markers – providing clarity in presentation and grading reliability.

Include data or information in the question to reduce the emphasis on memory and increase the emphasis on application and critical thinking.

Check that your draft question does not favour or disadvantage students from particular backgrounds or cultures.

Keep sentences short, layout clear and well spaced out and use precise and unambiguous language.

Check that the question standard and assessment criteria are at Masters level.

Check – does the question enable students to excel and allow markers to discriminate between able and excellent performances?

The marking guidance

Marking guidance can take many forms – it could be a model/specimen answer, a list of elements that should or could be included in an answer, or a worked calculation, etc.

However, marking guidance should go beyond lists of content or topics and should also address format, analytical level, originality, etc.

Guidance should include a marking scheme that discriminates between possible grades and is clear to all co-markers.

Other forms of assessment

This workbook has focussed on the challenge of writing exam questions. However, many of the principles and good practices highlighted are equally applicable to the design of in-course assessments such as assignments, reports and projects. Where there is more than one assessment task for a module or course it is important to ensure that certain learning outcomes are not over-assessed whilst others are neglected.

Further Reading Suggestions

Bloom, B. S., Krathwohl, D. R., and Masia, B. B. (1956). Taxonomy of Educational Objectives: The Classification of Educational Goals. New York, NY: D. McKay.

Brown, G., Bull, J. and Pendlebury, M. (1997) Assessing student learning in higher education. Routledge.

Crowe, A., Dirks, C. and Wenderoth, M.P. (2008) Biology in Bloom: Implementing Bloom’s Taxonomy to Enhance Student Learning in Biology. CBE Life Sci Educ 7(4): 368-381. American Society for Cell Biology. Available at http://www.lifescied.org/cgi/content/full/7/4/368

Coutinho, S. A. (2007). The relationship between goals, metacognition, and academic success. Educate 7, 39–47.

Entwistle, A., and Entwistle, N. (1992). Experiences of understanding in revising for degree examinations. Learn. Instruct. 2, 1–22.

Haines, C. (2004) Assessing Students’ Written Work: Marking essays and reports. Key guides for effective teaching in higher education. RoutledgeFalmer.

McMillan, J.H. (2001) Classroom assessment: Principles and practice for effective instruction. Boston: Allyn and Bacon.

Pass-it Good Practice Guide. http://www.pass-it.org.uk/resources/031112-goodpracticeguide-hw.pdf

Piontek, M.E. (2008) Best Practices for Designing and Grading Exams. Centre for Research on Learning and Teaching, Occasional Papers no. 24, University of Michigan. http://www.crlt.umich.edu/publinks/occasional.php

Useful web-sites

Center for Instructional Development and Research Resources – Writing Exam Questions

A collected set of web-links and guidance sites on writing exam questions, mostly from institutions in the USA. Lots of information on writing MCQ questions and comparing them with other forms of written assessment.

http://depts.washington.edu/cidrweb/resources/exams.html

Key School documents used as reference material in this Workbook (and where to find them)

LSHTM Assessment Code of Practice (January 2012)
http://www.lshtm.ac.uk/edu/taughtcourses/staffresources/index.html

Guidelines for writing exam questions
http://www.lshtm.ac.uk/edu/taughtcourses/exams_assmt_staff/index.html

Assessment Irregularities Procedures
http://www.lshtm.ac.uk/edu/taughtcourses/handbooks_regs_pols/index.html

MSc Marking Scheme
http://intra.lshtm.ac.uk/registry/regulations/taught_regulations/index.html

Appendices

Appendix 1.

Feedback comments are inserted below each question.

EXERCISE

For the questions given below - Underline the verb and key elements of the question that give an indication of the extent (limits and boundaries) of the question. Do you feel these are appropriate for Masters level study?

1. Describe the three main methods of economic evaluation (40%). What are the main strengths and weaknesses of each method? (40%). Support your answer with examples of disease evaluation (20%)

Feedback: ‘Describing’ is a relatively low level cognitive skill but then the student is asked to evaluate the three methods by giving strengths and weaknesses – this is the Masters level task in this question. Factors that give limits are the requirement to describe ‘three’ methods and to support the answer with examples.

2. A recent retrospective analysis of health records in the Gambia has suggested that the incidence of malaria has fallen dramatically in that country over the last 10 years. The elimination of the disease is beginning to be discussed. The National Malaria Control Programme has begun a surveillance system to detect future changes. What advice would you give the National Malaria Control Programme on how to organize a surveillance system for malaria? Give practical tips for ensuring its quality.

Feedback: Giving ‘Advice’ requires the students to select from and apply their knowledge in order to synthesise an appropriate surveillance system – this is Masters level. Students are also asked to consider what makes such a system ‘Quality’ – this could be considered a further degree of difficulty. The limits in this question are given by the scenario of the question, which makes it specific to a country and a disease context.

3. Write short notes on THREE of the following. In each case explain the importance of the infectious agent and the mode of transmission in its spread and control.
a) rotavirus diarrhoea
b) measles
c) guinea worm
d) dengue
e) tuberculosis

Feedback: This question does not clearly articulate Masters level requirements, as the ‘Write short notes’ does not indicate a level and the ‘Explain the importance’ may or may not require some level of evaluation and critique but could equally be a measure of memory, depending on what had been taught in the module.

Appendix 2. A detailed example – re-writing and formatting a question to ease interpretation (related to chapter 6)

This example has kindly been provided by the teaching team responsible for one of the DL programmes delivered by the School – Fundamentals of Clinical Trials. It shows clearly the way a set of guiding principles is used to mould a clearer question context, from a ‘great idea’ to a very demanding but fair question set-up. The team wanted to write a question that tested their students’ abilities to think about and apply key concepts rather than re-work the study materials provided. Past experience had underlined the importance of providing relevant and realistic question contexts, and considerable effort is made to vary the scenarios used in question setting. What is presented here is the first draft of the question, some team discussion notes and then the final question as it was used to assess the DL students.

a. An Early Draft of Question Y Context

(including feedback notes from members of the module team presented in boxes)

One of the concerns in the treatment of babies born to HIV+ mothers in developing countries is the transmission of HIV from mother-to-child during breast-feeding. Infant formula, if used safely and consistently, can prevent HIV from passing from mother to child but can result in increased infant mortality. In Botswana, free formula is provided and recommended for babies born to HIV-infected-mothers who are able to safely and consistently formula feed their infants. Despite the availability of milk powder, a number of HIV-infected-mothers continue to choose breastfeeding. This may be due in part to difficulties in consistently preparing safe-formula feeds and also the stigma associated with being seen to formula feed. For these HIV-infected-mothers exclusive breastfeeding is recommended with early weaning. To investigate potential strategies to prevent mother-to-child transmission of HIV the following randomized control trial (RCT) is planned:

Recruitment Criteria: Pregnant HIV infected mothers, who are currently not receiving antiretrovirals (ART) and who plan to breast feed

Randomisation: Women to be randomized into 2 groups.

Comment (D1): I think we can make each comment a bit snappier using the headings of a protocol document? – e.g. recruitment criteria, exclusion criteria

Both groups of women receive antiretrovirals as per current standard of care during pregnancy to reduce the risk of HIV passing from the mother to the child, and their infant takes 1 month of prophylactic antiretrovirals following delivery.

Interventions for comparison:
o Group One: mothers discontinue ARVs after delivery (unless ARV needed for their own health)
o Group Two: mothers continue ARVs for 6 months after delivery

The primary endpoint: the proportion of babies alive and uninfected with HIV by 7 months of age.

Secondary endpoints: include cumulative HIV free survival at 7 and 18 months and safety of maternal ARV prophylaxis for HIV exposed infants.

Teaching Team comments: We were worried that this would be a more difficult context to grasp for those whose first language is not English and those who do not have specialist knowledge of HIV (i.e. the intervention is given to the mother but its impact is on the infant, and the primary outcome that we are monitoring is HIV-free survival in the infant). However, we felt that this context was in keeping with the level of understanding we would expect from students enrolled on the module. It is also reflective of a study based in a developing country. Thus we felt that re-presenting the context in table form would make it easier to understand.

b. Suggested Improvements to Question Y Context

HIV can be transmitted from mother to child during pregnancy, during delivery, and after delivery, through exposure to HIV via breast milk. Prevention of mother-to-child transmission (PMTCT) of HIV therefore focuses on interventions that reduce the risk at each of these times. A research team wishes to determine the efficacy and safety of adding maternal antiretroviral prophylaxis during breastfeeding to the current local standard of care, for PMTCT of HIV. The current standard of care includes antiretroviral prophylaxis for the pregnant mother and one-month of prophylaxis for the infant after birth. Having identified their primary outcome as infant HIV-free survival at 7 months, they plan the following randomised control trial (RCT):

Comment (D2): Don’t know what to call this? This is ethically lead as well as base line driven???

Comment (D3): I brought this one up from down below!!!

Comment (D4): Lets talk over the phone because I still think we have some way to go with setting up this context. We need to make it as clear as possible for the students and this is a complicated trial!

Table 1: Proposed post randomisation* treatment of the intervention and control arms

*Randomisation: Pregnant women are randomised 1:1 to the intervention or control arm

| Arm | During pregnancy and labour/delivery: Maternal Standard of Care | After Delivery/During Breastfeeding: Maternal Intervention | After Delivery/During Breastfeeding: Infant Standard of Care |
|---|---|---|---|
| Intervention Arm | Maternal antiretroviral prophylaxis from 28 weeks gestation | 6 months maternal antiretroviral prophylaxis | 1 month infant prophylaxis |
| Control Arm | Maternal antiretroviral prophylaxis from 28 weeks gestation | Usual care, i.e. without 6 months maternal antiretroviral prophylaxis | 1 month infant prophylaxis |

Appendix 3 Extended Case Study showing the Development of a real Exam Question - CT101 Fundamentals of Clinical Trials

This extended case study, based on a real example, aims to show the stages of development that the question went through and the reflections on the process made by the course team (shown in comment boxes).

The case study includes the following sections:

3.1 Question Background
3.2 A Work in Progress (presented in four steps)
i) An Early Draft with Feedback (Autumn Term)
ii) The Question Amended after Feedback from the Exam Chair (July)
iii) Some Fine Tuning (Final Version)
iv) A Completed Work? (Some reflections on the use of the question)
3.3 A reflective exercise

3.1 Question Background

Question Motivation: We wanted to move away from the overused cardiology drug trial examples of previous exam papers. We have a diverse tutor team that included clinical trialists working at the Institute of Mental Health, and we were inspired by a BMJ article by Goodyer et al reporting on a mental health trial on major depression in adolescents.

Question Context: A mental health trial (major depression) taking place in adolescents. The intervention of interest was Cognitive Behavioural Therapy (CBT) as an add-on to the standard drug treatment, selective serotonin reuptake inhibitors (SSRIs). The outcome measurements were determined by a mental health questionnaire.

Overall Objective: To see if students could apply key fundamental principles of clinical trials to:

a unique patient base (i.e. adolescents)

a non-drug intervention (i.e. CBT)

an outcome that is measured by questionnaire (rather than by clinical measurements).

Assessment Needs: The exam was composed of two questions. Prior to question setting, we identified and allocated the key concepts (as covered in the distance learning study material for this module) to be tested for each question. For this question the chosen key principles to test were:

Trial designs;

Recruitment;

Blinding;

Randomisation;

Bias

The second question was to be much more numerical/statistical in nature, thus this first question excluded calculation type questions. Question two also included questions specifically designed to be “grade differentiators”. The first question was seen as testing students’ understanding and application of central and “straightforward” concepts.

Question Types: We aimed to include a range of question types.

Who was involved in question development?: Three key tutors, course director(s), the external examiner and exam chair. So there was lots of input from a number of experts, many drafts, and a long communication trail before agreeing a final version. What follows tracks this process.

3.2 A Work In Progress (Four steps i, ii, iii and iv)

Step i An Early Draft with Feedback (Autumn Term)

The question is given in plain text, marking guidance is indented and in italics, and the module team’s discussion notes are in the comment boxes alongside the text.

The Question

Selective serotonin reuptake inhibitors (SSRIs) are prescribed for the treatment of major depression in adolescents (age 11-16), although there are concerns regarding their usefulness and a raised risk of suicide. The National Institute for Health and Clinical Excellence (NICE) recommends the use of SSRIs in combination with Cognitive Behavioural Therapy (CBT) in the UK. This recommendation is based on data collected from the United States.

Comment (A1): What sort of depression – major? Should we define with a depression score?

Comment (A2): Should we define adolescents? 11-16? Not certain what is the standard.

Investigators plan to conduct a pragmatic (effectiveness) randomised controlled trial of SSRIs versus SSRIs with cognitive behavioural therapy (1 therapy session per week) in the UK in adolescents with major depression.

a) Why do you think the investigators plan to conduct a pragmatic trial as opposed to an explanatory (efficacy) trial?

(10 marks?)

The efficacy of an intervention is the benefit it achieves under ideal conditions. The effectiveness of an intervention is the benefit it achieves through routine clinical practice. In this case research is aimed at informing standard practice in the UK. Thus it is important to compare how each intervention performs at the GP/community level in order to inform a policy decision.

c) What design and conduct features would you apply in order for the trial to be explanatory or pragmatic? (10 marks)

Think about the eligibility criteria and how restrictive this should be.

Think about who will be delivering the therapy intervention, what training and experience these people would have.

Comment (A4): Will need to return to marks to check them later.

Comment (A5): Need to include here why this is good. Then what is the weakness of this approach.

Comment (A6): There is lots of good discussion on blackboard. Check bb to pad out model answer.

Comment (A7): I’m not certain I would know how to answer this question. Design and conduct in one question overwhelms me – sorry I’m just a babe in arms really! I’m guessing your direction is to think about an intention to treat analysis and how we define a protocol deviation. If we continue down the pragmatic route do you think we could streamline this question?

Comment (A3): I know generally it is good to split questions out, but I think here we could give more information in the scenario then ask the student to explain. My thinking is that depending on the objectives an efficacy trial might be of interest. So it really depends what your end game is.

How strict do you want to be in ensuring that the therapy is being delivered according to the manual?

How much effort should be in place to minimise deviations from the protocol, such as treatment withdrawals and poor compliance with medication and therapy?

d) Please discuss three potential barriers to recruitment (including consent) and retainment of adolescents on this trial. (15 marks)

Parents may not want to enter their children into a trial involving an SSRI because of the risk of suicide, especially as they have major depression. Therefore recruitment may be slow. As NICE recommends SSRIs in conjunction with CBT based on US data there may not be equipoise for this trial. Therefore clinicians may be unwilling to randomise highly depressed children and their parents may also be unwilling to be involved. Both treatments would be available outside of the trial and therefore there is not as much incentive to take part in a clinical trial. The population may include those younger than 16 and therefore specific consent and assent procedures would need to be put in place. There may be a larger drop out in the CBT arm due to the extra burden of having to attend numerous therapy sessions. Alternatively the extra attention may be beneficial and increase retainment.

e) At the design stage a third treatment arm was suggested for inclusion consisting of placebo only. What would be the advantages and disadvantages of including this treatment arm?

(4 marks)

The disadvantages would be that it would be ethically unacceptable to include a placebo treatment for this group of participants. Including a third arm would increase the numbers needed to be recruited. The inclusion of a placebo arm would allow a direct evaluation of treatment against no treatment.

Comment (A8): Hi – have reworded here but not necessarily for the better. Is three ok? Should we be overt about consent? Will need to make marking more friendly to three points eg 15 marks.

Comment (A9): Like the idea of this question because you have to think about it. But not certain whether it is a step too far for the students. I think maybe we could drop and ask about randomisation? I think we have logged about 10 marks.

The primary outcome of the trial was the Health and Nation Outcome scale which is a 12 item scale covering a wide range of health and social domains such as psychiatric symptoms, physical health, functioning, relationships and housing. Each question is marked from 0 (no problem) to 4 (severe problem). This was completed by an interviewer at 12 weeks post randomisation. Two hundred adolescents were to be recruited into the trial from six centres. Simple Randomisation was used to allocate treatment in the ratio 1:1. Each centre had one interviewer collecting data and several therapists giving CBT.

f) Discuss whether blinding of the participant, interviewer and therapist would be possible in this trial. (10 marks)

The participant could not be blinded as they would know whether they were receiving treatment or not. The interviewer could try to remain blind to allocation, but likely that the patient would mention what they had been receiving. The therapist would not be blind to treatment allocation

g) Identify and discuss possible sources of bias that could occur in each of the design, conduct and analysis stages of this trial. (15 marks)

Bias is defined as systematic distortion of the estimated intervention effect away from the truth caused by inadequacies in the design, conduct or analysis of a trial.

The outcome measure is subjective, different interviewers could be rating people very differently.

It is not possible to blind participants so what they may report may be dependent on their treatment group.

More experienced therapists could always treat the more severely depressed participants.

We could end up with baseline imbalance for important variables such as centre, sex, severity of depression as these have not been taken account of in the randomisation scheme and due to relatively small numbers we could expect some imbalance when using simple randomisation.

Need to ensure that no selection bias has taken place.

New Question on Randomisation: What are the advantages and disadvantages of using simple randomisation in this trial?

Easy versus imbalance in sample size numbers. Might recommend randomised permuted blocks?

Comment (A10): Lovely. Should we be adding anything about a composite score and how we use that to conclude? Or is this too much info? (I.e. what is considered as an improvement? I think this information can come later)

The primary analysis of this trial showed that at 12 weeks post randomisation the mean (standard deviation) of the primary outcome was 18 (CI 7.5) in the SSRI group and 17.1 (CI8.3) in the SSRI plus CBT group. The difference between the two groups was not statistically significant under an intention-to-treat analysis.

h) What can you conclude from this result?

j) What other information would you consider when interpreting the results of this trial? Think particularly about what may be reported in the publication. (20 marks)

Is the sample size large enough? How many people were included in the analysis? Was this the best and most appropriate design? Has there been substantial bias introduced? What were the results of the secondary outcomes, in particular safety? Are the conclusions similar under a per protocol analysis? Was the randomisation successful, i.e. are the treatment groups balanced? Is the trial population generalisable to inform policy decisions?

Step ii The Question Amended after Feedback from the Exam Chair (July)

(Ready for Review By The External Examiner)

The Question Re-worked

Selective serotonin reuptake inhibitors (SSRIs) are prescribed for the treatment of major depression in adolescents (age 11-16), although there are concerns regarding their usefulness and a possible raised risk of suicide. The National Institute for Health and Clinical Excellence (NICE) recommends that the National Health Service (NHS) in the UK use SSRIs (oral drug) in combination with Cognitive Behavioural Therapy (CBT). This recommendation is based on data collected from the United States, which could limit the generalisability to depressed adolescents in the NHS.

Investigators in the UK conducted a pragmatic (effectiveness) randomised controlled trial of SSRIs alone versus SSRIs with cognitive behavioural therapy (1 therapy session per week) in adolescents with major depression. Participants were treated for 12 weeks.

Comment (A11): Am I right in remembering this as the confidence interval?

Comment (L1): JR – Condense to what’s needed to answer the question.

Comment (R1): EL – Yes, but we also need to be careful not to disadvantage those who know nothing in this area.

Comment (L2): Can anyone see how to edit this further?

a) Explain what is meant by a “pragmatic” randomized control trial. Why do you think these investigators conducted a pragmatic trial rather than an explanatory trial? (Hint: You will need to consider the advantages and disadvantages of each approach.) (12 marks)

Pragmatic trials

A pragmatic trial determines the effectiveness of an intervention. This is the benefit it achieves through routine clinical practice.

Advantages: They are generalisable to routine clinical practice.

Disadvantages: Larger sample sizes are often required.

Explanatory trials

An explanatory trial determines the efficacy of an intervention. This is the benefit it achieves under ideal conditions.

Advantages: Variation can be reduced due to the strict procedures and so inferences can be made from smaller sample sizes. You can determine whether the intervention actually works.

Disadvantages: They are conducted under strict procedures and so are not very generalisable to routine clinical practice.

In this case research is aimed at informing standard practice in the UK. Thus it is important to compare how each intervention performs at the GP/community level in order to inform a policy decision. Compliance may be an issue in this age group.

b) Given this trial is conducted in adolescents, discuss the potential challenges to recruitment and retainment of study participants. (6 marks)

Parents may not want to enter their children into a trial involving an SSRI because of the risk of suicide, especially as they have major depression. Therefore recruitment may be slow.

As NICE recommends SSRIs in conjunction with CBT based on US data there may not be equipoise for this trial. Therefore clinicians may be unwilling to randomise highly depressed children and their parents may also be unwilling to be involved.

The two previous issues (for some too high risk, for others effectiveness already proven) could be described as a problem of equipoise that could affect recruitment.

Both treatments would be available outside of the trial and therefore there is not as much incentive to take part in a clinical trial.

The population may include those younger than 16, and therefore consent procedures for non-adults are more complicated and challenging.

There may be a larger drop out in the CBT arm due to the extra burden of having to attend numerous therapy sessions. Alternatively the extra attention may be beneficial and increase retainment.

They should get some points for mentioning that, because this is a pragmatic trial, the drop out will reflect the normal situation; as what they are evaluating is a policy of recommending CBT, it will not bias the research question.

Adolescents may drop out when they leave school.

Two hundred adolescents were to be randomly assigned into the trial from six centres. Each centre had one interviewer collecting data and several therapists giving CBT. The primary outcome of the trial was the total score of the Health and Nation Outcome scale which is a 12 item scale covering a wide range of health and social domains such as psychiatric symptoms, physical health, functioning, relationships and housing. Each question is marked from 0 (no problem) to 4 (severe problem). This was completed by an interviewer at 12 weeks post randomisation. The primary analysis compared the average total score of the Health and Nation Outcome scale at 12 weeks post randomisation between treatment groups (SSRI alone versus SSRI+CBT).

c) Explain the term “double-blind” in the context of a randomized control trial. Discuss whether double-blinding would be possible in this trial. (8 marks)

The term “double-blind” means that neither the participant, nor the person treating the patient (i.e. doctor and/or therapist), nor the person responsible for evaluating the outcome (the researcher) knows whether the patient has been allocated to treatment or not. In this case we also have the interviewer to think about, who may or may not be the evaluator.

The participant could not be blinded as they would know whether they were receiving treatment or not

The interviewer could try to remain blind to allocation, but likely that the patient would mention what they had been receiving

The therapist would not be blind to treatment allocation

It might be possible to blind the analyst/data processing team.

d) Identify and discuss three possible sources of bias that could occur in this trial. (6 marks)

5 marks for each well explained challenge up to a maximum of 15

The outcome measure is subjective, different interviewers could be rating people very differently.

It is not possible to blind participants so what they may report may be dependent on their treatment group.

More experienced therapists could always treat the more severely depressed participants

We could end up with baseline imbalance for important variables such as centre, sex, severity of depression as these have not been taken account of in the randomisation scheme and due to relatively small numbers we could expect some imbalance when using simple randomisation.

Need to ensure that no selection bias has taken place.

Inclusion/exclusion criteria

Timing of collection of outcomes

Experience of interviewers/therapists

Number of CBT sessions needed.

Concomitant medications

Therapies previously received

The process of implementing randomisation should be clear

e) In this trial, the researchers used simple randomization to allocate patients to either SSRI alone or SSRI in conjunction with Cognitive Behavioral Therapy. Explain what is meant by simple randomization. Discuss it's limitations and describe an alternative method that might be appropriate for this trial. (12 marks)

Simple randomisation is the equivalent of tossing an unbiased coin. For example, heads may mean allocation to one arm and tails to the other arm. Using randomisation lists prepared by this method ensures each patient has an equal chance of either treatment plan. Randomisation lists are most commonly generated by computer. However, given that this trial is of limited size, there is the possibility that substantial imbalance might occur between the two arms.

In order to ensure similar numbers in the treatment groups throughout the trial, some form of restricted randomisation is usually employed. The most common form of this is the use of random permuted blocks (block sizes of 4, 6, 8, 10, etc.). The idea behind random permuted blocks is that the number of patients allocated to each arm of the trial is the same at certain points in the recruitment process. For example, if a block size of 10 is used, then for each 'block' of 10 patients, five will be allocated to each treatment, with the sequence of treatment allocations within each block ordered randomly with no connection between assignments.

The problem of predictability increases as the block size decreases. For example, a block size of two would mean every other treatment could be predicted. It is also advisable not to inform the investigators that blocking was used and in particular what size the block is. However, it is relatively straightforward for an investigator to work out the block size after a number of patients have been randomised.
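The two methods described in this model answer can be illustrated with a short sketch. It is not part of the original exam materials: the function names, the seed and the block size of 10 are assumptions chosen for illustration, and the trial size of 200 is taken from the question.

```python
import random

ARMS = ("SSRI", "SSRI+CBT")


def simple_randomisation(n_patients, rng):
    """Allocate each patient independently, like tossing an unbiased coin."""
    return [rng.choice(ARMS) for _ in range(n_patients)]


def permuted_block_randomisation(n_patients, block_size, rng):
    """Allocate in blocks that each contain equal numbers of both arms."""
    if block_size <= 0 or block_size % len(ARMS) != 0:
        raise ValueError("block_size must be a positive multiple of the number of arms")
    allocations = []
    while len(allocations) < n_patients:
        block = list(ARMS) * (block_size // len(ARMS))
        rng.shuffle(block)  # random order of allocations within the block
        allocations.extend(block)
    return allocations[:n_patients]


def imbalance(allocations):
    """Absolute difference in the numbers allocated to the two arms."""
    return abs(allocations.count(ARMS[0]) - allocations.count(ARMS[1]))


rng = random.Random(2010)  # fixed seed (an assumption) so the sketch is repeatable
simple = simple_randomisation(200, rng)
blocked = permuted_block_randomisation(200, 10, rng)

print("Imbalance under simple randomisation:", imbalance(simple))
print("Imbalance under permuted blocks:", imbalance(blocked))
```

With blocks of 10 the two arms are equal after every tenth patient and never differ by more than five in between, whereas the imbalance under simple randomisation depends on chance alone.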

The primary analysis of this trial showed that at 12 weeks post randomisation the mean (standard deviation) of the primary outcome was 18 (SD=7.5) in the SSRI group and 17.1 (SD=8.3) in the SSRI plus CBT group. The difference between the two groups was not statistically significant under an intention-to-treat analysis.

f) What other information would you consider important to report in the publication of this trial to be able to interpret its results?

(6 marks)

Is the sample size large enough?

How many people were included in the analysis?

Was this the best and most appropriate design?

Has there been substantial bias introduced?

What were the results of the secondary outcomes, in particular safety?

Are the conclusions similar under a per protocol analysis?

Was the randomisation successful, i.e. are the treatment groups balanced?

Is the trial population generalisable to inform policy decisions?

Comment (L3): JR – This is too broad and vague a question. Change to something more specific.

Comment (R2): EL Tom/Luke – any ideas to make this more specific?

Comment (L4): JR – But why wasn’t baseline score used and ANCOVA adjusted for baseline done?

Step iii Some Fine Tuning

Final Version of the Question

Selective serotonin reuptake inhibitors (SSRIs) are prescribed for the treatment of major depression in adolescents (age 11-16), although there are concerns regarding their usefulness and a possible raised risk of suicide. In the UK, SSRIs are recommended in combination with standard Cognitive Behavioural Therapy (CBT). This recommendation is based on data from the United States, which could limit applicability for practice in the UK. Investigators in the UK therefore conducted a pragmatic randomised controlled trial of SSRIs alone versus SSRIs with a 12 week course of CBT in adolescents with major depression.

a) Explain what is meant by a “pragmatic” randomised controlled trial. Why do you think these investigators conducted a pragmatic trial rather than an explanatory trial? (Hint: You will need to consider the advantages and disadvantages of each approach.) (12 marks)

Pragmatic trials

A pragmatic trial determines the effectiveness of an intervention. This is the benefit it achieves through routine clinical practice.

Advantages: They are generalisable to routine clinical practice

Disadvantages: Larger sample sizes are often required

Explanatory trials

An explanatory trial determines the efficacy of an intervention. This is the benefit it achieves under ideal conditions.

Advantages: Variation can be reduced due to the strict procedures and so inferences can be made from smaller sample sizes

You can determine whether the intervention could actually work

Disadvantages: They are conducted under strict procedures and so not very generalisable to routine clinical practice.

There is already efficacy information but in this case research is aimed at informing practice in the UK. Thus it is more important to conduct a pragmatic trial set within UK practice.

b) Given this trial is conducted in adolescents, discuss two potential challenges to recruitment of study participants. (6 marks)

Parents may not want to enter their children into a trial involving an SSRI because of the risk of suicide, especially as they have major depression. Therefore recruitment may be slow.

As the UK recommends SSRIs in conjunction with CBT based on US data, there may not be equipoise for this trial. Therefore clinicians may be unwilling to randomise highly depressed children and their parents may also be unwilling to be involved.

The population may include those younger than 16 and therefore consent procedures for non-adults are more complicated and challenging

Adolescents may drop out when they leave school

Two hundred adolescents were to be randomly assigned into the trial from six centres. The primary outcome of the trial was the total score of the 12-item Health of the Nation Outcome scale covering psychiatric symptoms, physical health, relationships and housing. This was completed by an interviewer at 12 weeks post randomisation.

c) Explain the term “double-blind” in the context of a randomised controlled trial.

Discuss whether double-blinding would be possible in this trial. (9 marks)

The term “double-blind” means that neither the participant, nor the person treating the patient (i.e. doctor and/or therapist), nor the person responsible for evaluating the outcome (the researcher) knows whether the patient has been allocated to treatment or not. In this case we also have the interviewer to think about, who may or may not be the evaluator.

The participant could not be blinded as they would know whether they were receiving treatment or not

The interviewer could try to remain blind to allocation, but it is likely that the patient would mention what they had been receiving

The therapist would not be blind to treatment allocation

It might be possible to blind the analyst/data processing team.

d) Identify and discuss three possible sources of bias that could occur in this trial. (9 marks)

The outcome measure is subjective; different interviewers could be rating people very differently.

It is not possible to blind participants so what they may report may be dependent on their treatment group.

Timing of collection of outcomes might differ between groups because of their different experiences in the trial and possibly different retention rates

Concomitant medications

As the trial can’t be blind post-randomisation, it is possible that people recruiting might begin to see a pattern and guess what the next patient would be allocated to

Disappointment bias if some people wanted CBT and didn’t get it

e) In this trial, the researchers used simple randomisation to allocate patients to either SSRI alone or SSRI with CBT. Explain what is meant by simple randomisation. Discuss its limitations and describe one alternative method that might be appropriate for this trial. (14 marks)

Simple randomisation is the equivalent to tossing an unbiased coin. For example, heads may mean allocation to one arm and tails to the other arm. Using randomisation lists prepared by this method ensures each patient has an equal chance of either treatment plan. Randomisation lists are most commonly generated by computer.

Main limitations are possible imbalances between the two arms, either in terms of very different numbers between the two groups or in terms of imbalances in key prognostic factors between the groups. Both limitations are likely given this trial’s planned size.

In order to ensure similar numbers in the treatment groups throughout the trial, some form of restricted randomisation is usually employed. The most common form of this is the use of random permuted blocks (block sizes of 4, 6, 8, 10, etc.). The idea behind random permuted blocks is that the number of patients allocated to each arm of the trial is the same at certain points in the recruitment process. For example, if a block size of 10 is used, then for each 'block' of 10 patients, five will be allocated to each treatment, with the sequence of treatment allocations within each block ordered randomly with no connection between assignments.

The problem of predictability increases as the block size decreases. For example, a block size of two would mean every other treatment could be predicted. It is also advisable not to inform the investigators that blocking was used and in particular what size the block is. However, it is relatively straightforward for an investigator to work out the block size after a number of patients have been randomised.

To deal with imbalances in key prognostic factors between the groups, we may want to stratify by these factors. Centre is often such a factor, as are severity of depression, age group and sex. If there are too many such factors, we may prefer to use minimisation, which is a dynamic process dealing with both types of imbalance together, and which is appropriate for UK settings where computers are easily available.
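Stratified randomisation, as described in this model answer, can be sketched as a separate sequence of permuted blocks for each stratum. The sketch below is illustrative only and is not taken from the trial: the class name, the block size of 4, the seed and the strata labels are all assumptions, and minimisation is not shown.

```python
import random
from collections import Counter, defaultdict

ARMS = ("SSRI", "SSRI+CBT")


class StratifiedBlockRandomiser:
    """Keeps a separate sequence of permuted blocks for each stratum."""

    def __init__(self, block_size=4, seed=None):
        if block_size <= 0 or block_size % len(ARMS) != 0:
            raise ValueError("block_size must be a positive multiple of the number of arms")
        self.block_size = block_size
        self._rng = random.Random(seed)
        self._pending = defaultdict(list)  # stratum -> allocations left in its current block

    def allocate(self, stratum):
        """Return the next allocation for a patient in the given stratum."""
        queue = self._pending[stratum]
        if not queue:
            # Start a new balanced block for this stratum and shuffle its order.
            block = list(ARMS) * (self.block_size // len(ARMS))
            self._rng.shuffle(block)
            queue.extend(block)
        return queue.pop()


# Hypothetical strata: two centres crossed with two severity levels.
randomiser = StratifiedBlockRandomiser(block_size=4, seed=1)
counts = defaultdict(Counter)
for centre in ("centre 1", "centre 2"):
    for severity in ("moderate", "severe"):
        for _ in range(8):  # eight patients per stratum, i.e. two complete blocks
            arm = randomiser.allocate((centre, severity))
            counts[(centre, severity)][arm] += 1

for stratum, tally in counts.items():
    print(stratum, dict(tally))
```

Because every stratum completes whole blocks in this example, each one ends with four patients per arm. With many strata and few patients per stratum, blocks are often left incomplete, which is one reason the model answer suggests minimisation when there are too many factors.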

Step iv A Completed Work? (Some reflection on the use of the Question)

No! We still needed lots more work on the model answer to make it much more specific for the exam marking phase. We also didn't like marking it out of 50 – much easier to allocate marks to 100 (but the 50 was a constraint placed on us by the previous exam board)

Was it successful? Feedback from students:

o They liked the context – it was interesting

o The questions overall were clear and fair covering key concepts – thus they had a good opportunity to perform well

o Students need to think and apply concepts rather than recapitulating study materials.

There is always room for more tinkering!

3.3 A Reflective Exercise

EXERCISE

Please consider and make short notes on the following:

The Process

When do you begin developing examination questions in your course team?

What are the strengths and weaknesses for you in adopting a similar question development approach to the one described in the case study above?

Having read this case study – what elements would you like to transfer to your own approach to question writing?

The Question

How would you rate the above in terms of clarity, authenticity and fairness?

How strong would you expect the inter-marker reliability to be based on the marking guidelines provided?