Understanding differential attainment across medical training pathways: A rapid review of the literature

Final report prepared for The General Medical Council

Dr Sam Regan de Bere, Dr Suzanne Nunn, Dr Mona Nasser

21/08/2015

Funded by the General Medical Council.

The views expressed in this report are those of the participants and the authors and do not

necessarily reflect those of the General Medical Council.

Contents

Table of Figures
Table of Abbreviations and Acronyms
Executive Summary
    Introduction
    Research design
    Analysis
    Current narratives of differential attainment
1. Introduction
2 Background
3 Aims and purposes of the review
4 Methodology
    4.1 Rapid review
    4.2 Narrative synthesis
5 Methods
    5.1 Development and registration of protocol
    5.2 Search strategy
    5.3 Data management and extraction
    5.4 Quality assurance
6 Research Ethics
7 Data Analysis
8 Narrative synthesis
9 Findings
    9.1 The Individual or discrete group
        Study habits
        Psycho-social
        Social and cultural capital
        Success
        Ethnicity
        IMG
        Language
        Gender
    9.2 The institutional
        The Medical School and the working environment
        Mentoring
        Selection
    9.3 Policy
        Predictors of success at postgraduate level
        PMQ
        High stakes examinations
        MRCGP and MRCGP Clinical Skills Assessment (CSA)
        Examiner bias
10 Discussion
    10.1 Causes
    10.2 Ways of researching
    10.3 Possible interventions
11 Conclusions (the story so far)
12 References
13 Appendices
    Appendix 1: Studies and other documents included in the synthesis
    Appendix 2: Quality evaluation of studies using primary data

Table of Figures

Figure 1. Flow diagram of study selection

Figure 2. Analysis of included studies and documents by methodology or type

Figure 3. Publication by date

Figure 4. Conceptual map of themes identified in the published literature

Table of Abbreviations and Acronyms

AoMRC Academy of Medical Royal Colleges

BME Black and minority ethnic

BAPIO British Association of Physicians of Indian Origin

CSA Clinical skills assessment

FRCA Fellow of the Royal College of Anaesthetists (examination)

GMC General Medical Council

HEFCE Higher Education Funding Council for England

IELTS International English Language Testing System

IMG International medical graduates

MCAT Medical College Admission Test (USA)

MRCOG Member of the Royal College of Obstetricians and Gynaecologists (examination)

MRCPsych Member of the Royal College of Psychiatrists

MCQ Multiple choice question

NBME National Board of Medical Examiners (USA)

OSCE Objective Structured Clinical Examinations

PMQ Primary medical qualification

PLAB Professional and Linguistics Assessment Board examination

RCA Royal College of Anaesthetists

RCGP Royal College of General Practitioners

RCOG Royal College of Obstetricians and Gynaecologists

USMLE United States Medical Licensing Examination

Executive Summary

Introduction

Differential attainment is a term used to describe the variations in levels of educational

achievement that occur between different demographic groups undertaking the same

assessment. Differential attainment has been recognised as a challenge for medical

professionals and educators since the 1990s, and has been observed in both undergraduate

and postgraduate contexts. It is not specific to medical education; it is a feature of

professional education more generally.

Since 2010 the GMC has worked with others analysing data in order to better understand

the progress of trainees through their programmes and to identify any potential differences

between demographic groups. This rapid review of literature published in the period

between 2004 and the present day contributes to a wider programme of research being

carried out by the GMC to explore differential attainment across training pathways.

Research design

The research was commissioned to provide a rapid review of the corpus of knowledge

relating to differential attainment. The researchers adopted a narrative synthesis

methodology in order to explore how contributions to the literature had sought to define,

measure and explain differential attainment – and therefore to identify key factors that

might be considered as having an impact upon attainment.

An initial scoping exercise highlighted that the current corpus of literature comprises

materials in a variety of formats, including qualitative and quantitative research reports,

systematic reviews of attainment data patterns, policy documents and academic papers,

and opinion pieces and editorials.

Narrative synthesis provides a useful framework for accessing and analysing such diverse

and complex literatures. It lends itself to a ‘storytelling’ approach, by capturing a number of

different insights, evidence bases, theories and position pieces in context, and presenting

them together as an overarching narrative of differential attainment. In addition, rather

than imposing a definitive structure or sequential process, which might preclude certain

significant contributions that do not fit the initial review terms (1), narrative synthesis

allows researchers to move iteratively within a systematic approach – picking up on leads to

relevant information throughout the research process.

The search was conducted using the PubMed, MEDLINE and PsycINFO databases, within a

search strategy that included Medical Subject Headings (MeSH) terms and text-word

searches for maximal retrieval. These searches were supplemented with further iterative

searching of reference lists, and a grey literature search of stakeholder websites. The

research team was supplemented by an expert panel, members of which were selected in

order to provide advice on search terms, to discuss the quality of the retrieved literature, to

comment on any initial emergent themes and to review the final report prior to submission

to the GMC.

We developed two frameworks against which to evaluate the retrieved papers and grey

literature: PICOC (Population, Intervention, Comparison, Outcome, Context) for quantitative

papers and SPICE (Setting, Perspective, Intervention/phenomena of Interest, Comparison,

Evaluation) for qualitative papers and other documents. These frameworks provided

transparency for our identification of included papers and other documents. A total of 39

papers were included in the synthesis with the addition of 24 documents from the grey

literature.

The literature on differential attainment

The findings of narrative synthesis are grounded in the literature surveyed. The research

process does not begin with a set of a priori assumptions: instead, using this method

enables themes to emerge and be recorded as the literature is identified. The search

process highlighted that the evidence base relating to differential attainment is disparate,

that it includes a number of different research designs and variously applied methods, and

that it does not feature definitive terminology across studies. Concepts and terms are often

used interchangeably and are operationalised in research accordingly, which makes

constant or consistent comparison difficult to validate.

Overall the peer reviewed literature was of a high quality, where research aims, objectives,

methods and analyses were clearly articulated and justified. The main focus of primary

research was on the relationship between ethnicity and differential attainment in high

stakes examinations. While some studies are focused on undergraduate populations, some

on postgraduate doctors, and a number include both, we found that the research questions,

findings and conclusions were nevertheless relevant to understanding the emerging

narrative of differential attainment in postgraduate cohorts.

Given the limitations in the literature, we read and re-read the materials selected,

individually and then discussed as a team. During this process we used conceptual mapping

to help us understand the categories and themes arising from the entire data set. This

grounded approach led to the emergence of a three-level schema, providing three distinct

but related categories, or layers, of information on:

• the macro or policy level (investigating the political agendas and practical activity surrounding high stakes examinations)

• the meso or institutional level (exploring the impact of the medical school, training contexts and/or working environment)

• the micro or individual/discrete group level (with a focus on individuals or groups of students, doctors, examiners and so on)

Quantitative studies dominated the research base (26 studies), focusing on the macro level

and typically using large data sets to examine causal and associative relationships between

various demographic groups and different high stakes exams. The focus of the qualitative

research (5 studies) was more diverse, and explored the role of factors at the micro and

meso levels of infrastructures built to support examinations, cultural contexts and personal

interactions.

Two large scale commissioned studies in the grey literature examined the significance of

language and cultural factors for IMGs (2, 3) using mixed methods approaches, and one

further study combined a literature search with interviews.(4) In addition to this there was

one systematic review and meta-analysis of ethnicity and performance in UK trained doctors

and medical students, focusing on quantitative reports. (5)

Investigating assessment agendas and the design of high stakes examinations

The majority of studies dealing with the macro level focused on differential attainment in

high stakes exams. The research upon which this aspect of the literature is based typically

used quantitative methodologies, using large datasets, with a focus on testing for bias in the

exam, or a component part of it using exam data. Their conclusions are founded on typically

high quality, peer reviewed reports including clear validity measures.

Taken as a whole, these studies have broadly demonstrated the validity of high stakes

exams, and discounted evidence of bias in the nature and structure of exams themselves as

causal factors for differential attainment. However, the emerging narrative contains a

recognition that the infrastructures and processes put in place to support selection (6) and

high stakes exams may nevertheless encompass elements that lead to actual or implied bias

and/or differential attainment. (7)

Examples of this include: i) potential examiner bias through levels of concordance between

examiner and candidate in practical examinations (8-12), and ii) the lack of a universal

terminology to classify data, which may lead to different interpretations of bias and/or

differential attainment from the exam data. An example of this is the variation in the ways

the Royal Colleges monitor for protected characteristics, which has been identified as a

potential contributor to unfair bias. (13)

The impact of institutional structures and organisational contexts

Literatures focusing on the nuts and bolts of postgraduate education, at the level of the

medical school or workplace, highlighted the paucity of well-developed research into

postgraduate selection. These contributions drew on primary data, were typically published

in high quality academic sources, and editorial comments were presented by authors with

research or experience in this area. (14, 15)

In contrast to undergraduate selection, there is little research on postgraduate selection

processes. In the literature, selection processes were presented as highly variable,(4, 6)

although it was recognised that a rigorous (or otherwise) selection process might have

implications for attainment. Best practice selection methods highlighted in the sample

involved the identification of required competencies and the development of reliable

assessment methods for them. The narrative suggests that the application of a validation

process should be used to assess the predictive value of the selection methods.

Pre-entry advice and proper induction processes were identified across the international

literature as important factors for IMGs and other students who gained their PMQ outside

the country they wished to work in. (16, 17) One significant UK focused study (2) identified

the GMC as having a central role in developing a ‘joined up’ approach to supporting PMQs

and IMGs in addition to individual employers.

Buddying or mentoring was highlighted as a useful approach to assisting acculturation. A

literature search of PubMed identified mentoring programmes for undergraduates as having

positive impacts on attainment levels but cautioned that this was relevant only if such

programmes were based on robust designs and were evaluated to ensure effectiveness. This

review demonstrated that most research in the area of mentoring to improve attainment

has been undertaken in the USA. (18)

Understanding the role of the individual or discrete group

The literature pertaining to the individual or discrete groups suggested that a combination

of factors may be associated with educational performance. These include: learning styles

and psycho-social factors; demographic characteristics such as gender and ethnicity; wider

social and cultural capital; language and other, more tacit, contributors to success. The

literature exploring these factors used both qualitative and quantitative methodologies and

was generally of a high academic quality whereby methods and findings were justified

accordingly. Two of the four studies were UK-focused, both examining undergraduate

medical students (19, 20); the other two were qualitative studies, more narrowly focused on specific

types of student in the USA and Saudi Arabia, that examined contributors to success. (21, 22)

Numerous studies focused on ethnicity in relation to analysis of differential attainment at

macro, meso and micro levels. However, whilst this issue dominated the literature, the

complexity of the term was largely unaddressed. Terms such as IMG and BME were used

interchangeably and uncritically.

For example, while “BME” is a widely used term in public and private sector organisations to

incorporate a range of minority communities living in the UK, using it as an umbrella term to

group together diverse socio-cultural demographics has been critiqued – but typically this is

not addressed in the sampling or conclusions drawn from the various studies within the

literature.

Whilst perhaps more obvious, IMG is another umbrella term specific to medicine that

requires clear definition, for similar reasons. The narrative emerging from the literature

identifies “IMGs” as being increasingly important to the delivery of healthcare, but

nevertheless experiencing the inherent difficulties of migration and acculturation. However,

the specifics of these difficulties, how they might vary – and why this might be important for

differential attainment of IMGs – are absent from these discussions.

Similarly, ‘language’ is cited as a predictor of good performance but it is not proven to be, of

itself, the reason why students and/or doctors fail high stakes examinations. Moreover, any

sociological or psychological examination of ‘language’ is also missing, and the concept is

treated as unproblematic in terms of its application as a potential factor underpinning

attainment.

The key narratives of differential attainment

Following thematic analysis, narrative analysis was then used to identify any relationships

emerging between and across these themes. As has already been acknowledged, the

literatures are disparate and disjointed. However, their key messages are similarly

structured around: i) the potential causes of differential attainment, ii) the ways in which

differential attainment has been researched and iii) potential interventions to further our

understanding and help inform strategies going forward.

Understanding causes and relationships

The initial research undertaken into understanding differential attainment tended to focus

on the analysis of exam data with the aim of validating high stakes examinations or

identifying bias. There were 5 high quality quantitative studies included in the analysis (7,

12, 23-25). The dominant message from these studies was that, while the reasons for

differential attainment remained unclear, they were likely to be multifactorial.

The chronological trajectory of the research demonstrates that research is increasingly

emphasising the importance of educational and social factors in contributing to

performance. In this area research is frequently qualitative. We found 8 studies, key among

which were Woolf’s analysis exploring the relevance of stereotype threat (26) and

Vaughan’s study using social capital theory to understand the role of networks and social

behaviour (19).

Both of these studies focused on undergraduate medical students, but provided a way of

analysing differential attainment that bears relevance for postgraduate patterns. In terms of

studies examining the attainment levels of postgraduate students, Illing’s and Roberts’

studies were the most extensive in terms of scope and data analysed. (2, 3)

The general point to draw from this development of research foci in both undergraduate

and postgraduate fields (and one that suggests we may be best served by considering both),

is a growing consensus that researchers should not limit their analysis purely to exam results.

Current thinking acknowledges the requirement to examine the ‘whole’ of the exam; its

support structures (both formal and informal) and features of its candidature that go

beyond demographics to attitudes and behaviours.

Selection, language and the identification of facilitators, as well as barriers, are factors that

have been emphasised across a number of studies. In much of the literature, language is

used as a proxy for communication broadly, which is an umbrella category incorporating

gesture, pronunciation and intonation etc. This is an important observation, since

communication skills form part of clinical skills assessments and these carry with them

implicit cultural assumptions relating to the doctor-patient dynamic. The message emerging

here is that lack of acculturation will impact on performance and ultimately attainment,

even if clinical skills are to an expected standard or level of competency.

The literature also identifies poor induction, lack of support for IMGs in overcoming the

difficulties inherent to migration, and career change as factors that may disadvantage

IMGs in becoming better trained and acculturated doctors. A small number of studies have

highlighted the importance of considering factors that support higher levels of attainment.

Qualitatively, it is important to note these contributions to the building narrative: limiting

analysis to why certain individuals or discrete groups might fail to progress along the career

pathway risks ignoring evidence that identifies why other individuals or discrete groups

succeed – all of which might help us to understand different levels of attainment along the

spectrum.

The importance of appropriate research design

For the reasons outlined above, this review included studies employing different research

methods, the majority of which undertook the quantitative analysis of primary data. In

order to examine the complex nature of ‘causes’, qualitative research approaches have

more recently been used to examine complex phenomena embedded in the culture and

contexts of assessment.

This relatively recent turn to qualitative methodologies to capture evidence of complexity

adds depth of understanding to the breadth of the quantitative research literature. Indeed,

the narrative emerging from the more recent contributions to the literature suggests that

innovative research approaches are required now that complexity is acknowledged. Specific

recommendations within the literature include: longitudinal tracking, interdisciplinary

research to provide fresh perspectives, and the development of more appropriately

sophisticated theoretical frameworks.

A significant issue across the research is the lack of (either) transparency or consistent

definition around the categories of explanation. While some contributions acknowledge the

inherent difficulty in defining and categorising, it remains the case that umbrella categories

like BME and IMG, ethnic group and ethnicity have not been subjected to full interrogation.

In this sense, the development of suitable interventions to address the problem of

differential attainment is compromised by the problem of inconsistently applied definitions

and classifications across existing databases and research studies.

Possible interventions and future strategies

Overall, the differential attainment literatures suggest that a variety of factors may affect

performance and attainment. These include issues around the background and

characteristics of the individuals, the stage they are at in their medical career and the

organisational structure of different workplace settings. These might have cumulative

effects over time or ‘one-off’ effects at certain stages of their career.

Due in part to the variety of factors identified as potentially affecting performance and attainment,

the narrative emerging from the current body of knowledge recognises the need for

a complex intervention incorporating analysis of the micro, meso and macro levels of

engagement - rather than a simple intervention to establish cause and effect relationships

of single factors.

The first consideration in designing an intervention relates to the level at which the

intervention is required: at the individual level, the institutional level, a broader policy level,

or a complex intervention with components on each level. It is important to recognise that

any intervention targeted at a single level needs to be thought through across all levels in

case unanticipated effects at other levels emerge as a consequence.

Conclusion

This review has found that differential attainment in postgraduate medical education in the

UK cannot be attributed to a single identifiable cause, but results from a subtle combination

of factors yet to be fully explored. Over time, research has moved from the quantitative

analysis of exam data towards a more cross-disciplinary approach in order to explore a

combination of educational and social factors (rather than single causes) as contributors to

differential attainment. Such an interdisciplinary approach is now presented as essential for

developing a nuanced understanding of the complexities of differential attainment across

the micro, meso and macro-structure of medical education, and is viewed as the foundation

upon which future interventions may succeed.

1. Introduction

Differential attainment is a term used to describe variations in educational achievement by

different demographic groups undertaking the same assessment. It is a phenomenon readily

identified across the educational landscape, and research by HEFCE and others has

identified a complex range of personal, cultural, institutional and structural factors

impacting on parity.(27)

Differential attainment has been a recognised feature of medical educational achievement

since the 1990s in both undergraduate and postgraduate contexts. But interest in the

underperformance of ethnic minority doctors has been heightened in recent years in the UK

with a judicial review in the High Court (April 2014) for alleged racial discrimination against

ethnic minority doctors by the RCGP in their high stakes examinations. The legal challenge

from BAPIO was dismissed but the Judge recommended action on differential attainment

and that the RCGP should focus on training to ensure that candidates are prepared for their

examinations.

The Judicial review is often presented as the catalyst for action, whereas the GMC has been

working with others since 2010 to analyse data to better understand the progress of

trainees through their programmes. A commissioned independent review of the RCGP CSA

identified that overseas qualified doctors (IMGs) were 15 times more likely to fail the

CSA, and UK qualified BME doctors were four times more likely to fail than their white

counterparts at first attempt (the difference diminished for UK BME doctors on their second

attempt but differences for IMG BME doctors persisted on second and third attempts). (28)

Recent analysis of exam data has shown that in a simple univariate analysis the same

patterns of attainment were present across speciality groups. (29)

The present literature review contributes to a wider programme of work being carried out

by the GMC to explore differential attainment across training pathways.

2 Background

Differential attainment is a term used to describe variations in educational achievement by

different demographic groups undertaking the same assessment. Characteristics including

gender, age, ethnicity, nationality and socio-economic status, along with medical school and

postgraduate training programme, are all factors that HEFCE have identified as having a

correlation with performance and attainment.(27)

A search of PROSPERO and the Cochrane Library revealed that there are currently no

registered substantive reviews of differential attainment specific to postgraduate medical

education. There is however a growing body of literature examining potential causes and

factors relating to differential attainment across both undergraduate and postgraduate

medical education. (20, 21, 30, 31)

3 Aims and purposes of the review

The purpose of this review is to understand from the existing evidence the underlying

causes of differential attainment in postgraduate medical education in the UK and English-

language speaking countries with comparable medical education systems (USA, Canada,

New Zealand and Australia). This includes identifying different causes and/or significance of

causes across those countries, providing a conceptual framework to design interventions to

address these issues in the UK, identifying possible methods for further research in this area and

rating the strengths and weaknesses of evidence that may suggest areas for future research

and/or work.

The aims of the review are as follows:

o To establish an evidence base for differential attainment in the UK and other

comparable countries

o To identify any research methods pertinent to identifying and/or understanding the

causes of differential attainment in UK postgraduate medical education

o To examine interventions that have been effective in reducing differential

attainment that may be applicable to UK postgraduate medical education

o To rate the quality of evidence as a ‘springboard’ for future work

4 Methodology

4.1 Rapid review

Systematic reviews that engage with health policy are becoming increasingly valued by

policy makers as the evidence base becomes more complex (32). However, policy makers

often require a synthesis of knowledge on emerging issues within a short time frame in

order to facilitate a timely response and/or decision. A traditional systematic review takes at

least 12 months to complete; accelerating this process to produce a rapid review

requires the reviewers to undertake methodological ‘shortcuts’ to streamline the process.

There is currently no standardised method for undertaking rapid reviews, and indeed Oliver

argues that this may be counterproductive.(33) In a review of the methods used in rapid

reviews Ganann et al recommend transparency of reporting methods, in particular where

‘traditional’ processes had been streamlined. (34)

There is considerable debate about the relative merits of full systematic over rapid reviews,

with rapid reviews considered appropriate to answer focused questions or as an important

intermediary step to further research where interventions are complex. Rapid reviews may

lack the depth of full systematic reviews to present detailed recommendations, but a review

comparing cases where both rapid and full systematic reviews were conducted found that

overall there was no significant impact on the final conclusions of a review. (35)

4.2 Narrative synthesis

“‘Narrative synthesis’ refers to an approach to the systematic review and synthesis of

findings from multiple studies that relies primarily on the use of words and text to

summarise and explain the findings of a synthesis”.(1)

The flexibility of narrative synthesis lends itself to this type of ‘storytelling’ since rather than

having a definitive structure or sequential process (1) it relies on a framework that can be

broken down into four elements, through which the researchers can move iteratively:

• Developing a theory about how the intervention works, why and for whom

• Developing a preliminary synthesis of findings of included studies

• Exploring relationships within and between studies

• Assessing the robustness of the synthesis

5 Methods

5.1 Development and registration of protocol

The protocol for the research was developed by the core research team: Drs Regan de Bere,

Nunn and Nasser with support from the expert panel. The protocol for the research was

agreed with the GMC on 06/02/2015 and registered with PROSPERO on 26/02/2015. The

protocol was subsequently published on the PROSPERO

website (http://www.crd.york.ac.uk/PROSPERO), reference no. CRD42015017130.

5.2 Search strategy

The inclusion and exclusion criteria were agreed between the lead researchers and the

expert panel. These criteria set the boundaries for the research.

Table 1 Inclusion and exclusion criteria from the protocol

Inclusion:

• Published between 01/01/2004 and 01/01/2015

• UK and countries with comparable medical education systems (USA, Canada, New Zealand and Australia)

• In the English language

• Studies using any methodology singly or in combination and ‘grey’ literature

• Studies or documents related to postgraduate, and where appropriate to undergraduate medical education

• Differential attainment / success or failure

Exclusion:

• Disciplines outside medicine (e.g. pharmacy, dentistry, nursing and midwifery)

However, as the research progressed we did revisit and refine the initial criteria as we

identified gaps and leads to important relevant literatures previously excluded. For example,

although Norway was not on our original source list, while reviewing the literature we included

one Norwegian study (36) since it addressed conceptual issues we considered relevant to

the review (namely those surrounding gender and qualification related to working

environments). We also included a study from Switzerland examining the impact of

mentoring during postgraduate training. (37)

We searched PubMed using the following search strategy that includes MeSH and ‘free

text’.

#7 #3 AND #6 Filters: Publication date from 2004/01/01

#6 #4 OR #5

#5 (Attainment or success* or fail*)

#4 "Educational Status"[MeSH]

#3 #1 OR #2

#2 (postgraduate AND educat* AND med*)

#1 "Education, Medical, Graduate"[MeSH]
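The numbered search lines above build on one another, so it can help to see them expanded into a single Boolean expression. The following is a minimal sketch only: the `steps` dictionary and the `expand` helper are illustrative names introduced here, not part of the protocol, and the publication date filter on line #7 is applied in PubMed separately and is not represented in the string.

```python
import re

# Numbered search lines, copied verbatim from the strategy above.
# The date filter attached to #7 is a PubMed filter and is left out here.
steps = {
    1: '"Education, Medical, Graduate"[MeSH]',
    2: '(postgraduate AND educat* AND med*)',
    3: '#1 OR #2',
    4: '"Educational Status"[MeSH]',
    5: '(Attainment or success* or fail*)',
    6: '#4 OR #5',
    7: '#3 AND #6',
}


def expand(step):
    """Replace every #n reference in a search line with the bracketed text of line n."""
    return re.sub(
        r'#(\d+)',
        lambda match: '(' + expand(int(match.group(1))) + ')',
        steps[step],
    )


# Hypothetical helper output: the whole strategy as one expression.
query = expand(7)
print(query)
```

Expanding the lines in this way makes the logic explicit: records must match either the graduate medical education MeSH heading or the free-text postgraduate terms, and must also match either the educational status MeSH heading or one of the attainment, success or failure terms.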

This search strategy was adapted for other databases like PsycINFO. We also searched

reference lists of key papers to:

1. Ensure that our search criteria were identifying key papers

2. Identify additional papers and/or grey literature

We also added to the studies found through the searches from our own knowledge of the

subject literature.

We did not consult authors directly but met several leading researchers in the field at

related GMC events on 27/02/15 and 16/03/15 where there was the opportunity to discuss the

review.

We also placed a call on the GMC website for contributions from other researchers and

interested parties. This call produced no new sources of information.

The results of the searches, conversations and prior knowledge of the literature identified

prominent topic areas and issues in the medical education literature, as well as highlighting

those which have been less well documented. This information was later used to conduct

additional iterative searches in educational literature in order to fill any gaps identified.

As part of the selection process, we categorised relevant literature in medical education that

fell outside of our inclusion criteria, i.e. studies relating to other countries. The rationale for

this was to enable us to decide, at the later analysis stage, whether such studies might

help us fill any gaps.

After an initial screening of the results, we used NVivo 10, a data management software

package, to calculate the themes identified across the literature. Individual papers may

contain several foci and each is coded individually. By listing the number of studies that

reference each descriptive theme we developed a simple schema to identify gaps in the

literature. From this we conducted further iterative searches in the medical undergraduate

literature to assess if there were any generalizable findings from those studies.
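The theme counting described above was carried out in NVivo 10. As a rough illustration of the same idea outside NVivo, the sketch below tallies how many papers reference each theme and flags sparsely covered themes as possible gaps. Everything in it is an assumption made for illustration: the paper labels, the theme assignments and the `gap_threshold` cut-off are placeholders, not data or settings from the review.

```python
from collections import Counter

# Hypothetical coding: each paper is tagged with every theme it addresses.
# Paper labels and theme assignments are placeholders, not review data.
coded_papers = {
    "paper_A": {"ethnicity", "language"},
    "paper_B": {"ethnicity", "study habits"},
    "paper_C": {"gender"},
    "paper_D": {"ethnicity"},
}

# Count the number of papers that reference each descriptive theme.
theme_counts = Counter(
    theme for themes in coded_papers.values() for theme in themes
)

# Themes referenced by no more than gap_threshold papers are flagged as
# possible gaps. The cut-off is an illustrative assumption.
gap_threshold = 1
gaps = sorted(theme for theme, n in theme_counts.items() if n <= gap_threshold)

print(theme_counts.most_common())
print(gaps)
```

Because a paper can carry several themes, the counts sum to more than the number of papers, which mirrors the point above that individual papers may contain several foci.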

We also undertook general searching of relevant stakeholder websites listed below for grey

literature.

Table 2 Stakeholder websites searched

General Medical Council
British Medical Association
Royal College of Physicians and Surgeons of Glasgow
Royal College of Psychiatrists
Royal College of General Practitioners
Royal College of Ophthalmologists
Royal College of Obstetricians and Gynaecologists
Royal College of Radiologists
Royal College of Paediatrics and Child Health
Academy of Medical Royal Colleges (AoMRC)
Royal College of Physicians of Edinburgh
Royal College of Physicians of London
Royal College of Physicians of Ireland
Royal College of Surgeons of England
Royal College of Surgeons in Ireland
Royal College of Surgeons of Edinburgh
UK Higher Education Funding Council for England (HEFCE)
Other representative groups: BAPIO, Medical Women’s Federation

The initial search term used in ‘Google’ was:

name of the stakeholder AND differential attainment

We then searched iteratively within the stakeholder websites for additional documents.

5.3 Data management and extraction

In defining eligible literature formats, we included all content-relevant documents and

articles, regardless of the status of their publication. The final sample therefore included

academic studies, unpublished research, conference papers, guidance documents, opinion

pieces and so on. Editorial and opinion pieces are included since they can provide useful

insights and offer potential solutions or identify areas for thought. They were not formally

quality assessed, but we report on the perspective from which each paper was written

(the author and their background) and how this may have contributed to the shaping of

their argument.

We developed frameworks that disaggregated the elements of the research question,

against which to map the papers. Due to their structured nature, quantitative studies

tended to relate to the elements of the PICOC framework (Population, Intervention,

Comparison, Outcome, Context), whilst qualitative studies were typically more effectively

interrogated using the SPICE framework (Setting, Perspective, Intervention/phenomena of

Interest, Comparison, Evaluation). The frameworks provided a transparent method of

identifying papers to include and exclude from the synthesis.

We found no randomised or non-randomised controlled trials. Most studies focused on

evaluating certain factors like gender and ethnicity on the performance of the students.

Therefore, we have used a modified version of PICOC and SPICE frameworks for the final

synthesis presented in this report. This is still consistent with our methodology in the

protocol registered with PROSPERO (CRD42015017130).

5.4 Quality assurance

Due to the inclusion of a wide variety of material in the final synthesis, and the iterative

method of study and document extraction, the transparency of all decisions made about

inclusion is guaranteed by thorough documentation of each stage of the review and the

decision-making processes.

We undertook a quality assessment of the studies that included primary data using an

adapted version of the Critical Appraisal Skills Programme (CASP) framework for qualitative research. We used

this for both qualitative and quantitative studies since the key issues around quantitative

studies related to the approach to the questions, the design of the research as related to the

question, the study’s population and what was measured and how. The ratings of the

studies (high quality / unclear quality / low quality) are included where appropriate in

Appendix 1 and a fuller description of the evaluation of each study using primary data is

included as Appendix 2. We included a question related to generalizability of the study

(direct / indirect / unclear). This question does not contribute to the quality evaluation but is

reported separately to account for generalizability to the review.

The research team was supplemented by an expert panel to advise on search terms, discuss

the retrieved literature, any initial emergent themes and review the final report prior to

submission to the GMC. The expert panel (Sam Regan de Bere, Suzanne Nunn, Mona Nasser,

Paul Lambe, Julian Archer, Martin Roberts, Tom Gale and Rebecca Pitt) met to discuss

various stages of the review, including: feeding back on the research design; ratifying the

protocol; agreeing the selected academic literature; discussing themes emerging from the

literature; quality assessment and agreeing the structure of the final report.

During the process of the research the panel agreed that the retrieved literature was

representative of the field and that the search terms used had been appropriate. The panel

did not consider that there were any significant gaps in the literature: they suggested that,

rather than reinforcing extant knowledge by including the literature from other health

professions, the research team should concentrate on the emerging narratives and look to a

broader cultural literature to inform the socio/cultural and pedagogic narratives that were

emerging if required.

The panel did identify a lack of clarity in the terminology used in different studies across the

literature: in particular the words ‘performance’ and ‘attainment’ have been used

interchangeably. The panel suggested that, for the purposes of this review, the following

definitions should be applied: attainment would be used in reference to a direct

measurement, namely passing exams, whereas performance would refer to academic

performance as a process which implies a temporal element, with attainment being a

consequence of performance.

6 Research Ethics

The research for this review is desk based and ethical permission was not required.

7 Data Analysis

Initial database searches identified 3,044 potentially relevant documents. Duplicates were

removed (68) leaving 2,976 documents to be screened by title for possible inclusion in the

synthesis. Documents rejected at this stage, after exclusions were applied, were categorised

in case any gaps were identified in the literature and these documents needed to be

revisited. Ninety-six documents were retrieved as papers for further review (10% of these

being checked by SRdB against the inclusion criteria). From this tranche, 40 papers were

evaluated against the PICOC and SPICE frameworks, as described in the protocol; 8 failed on

one or more of the criteria, leaving 32 documents extracted for discussion by the expert

panel and potential synthesis. Following discussion, a further 3 papers were added on

the advice of the expert panel from their subject knowledge and 4 papers were added as a

result of iterative searching of the reference lists in the papers identified for synthesis.

A total of 39 papers were included in the synthesis with the addition of 24 documents from

the grey literature. A flow diagram of the search process is shown in Figure 1 below.
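The selection counts reported above can be checked with simple arithmetic. The sketch below is a sanity check only; the variable names are illustrative, and the combined total of published papers and grey literature documents is derived here by addition rather than quoted from the report.

```python
# Counts taken from the text of this section.
identified = 3044          # documents from initial database searches
duplicates = 68            # duplicates removed
screened_by_title = identified - duplicates

retrieved_for_review = 96  # retrieved as papers for further review
evaluated = 40             # assessed against the PICOC and SPICE frameworks
failed_criteria = 8        # failed on one or more criteria
extracted = evaluated - failed_criteria

added_by_panel = 3         # added on the advice of the expert panel
added_from_references = 4  # added from iterative reference-list searching
published_papers = extracted + added_by_panel + added_from_references

grey_documents = 24        # documents from the grey literature
# Derived figure (not stated in the report): all documents in the synthesis.
all_documents = published_papers + grey_documents

print(screened_by_title, extracted, published_papers, all_documents)
```

Each intermediate figure agrees with the text: 2,976 documents screened by title, 32 extracted for panel discussion and 39 papers in the synthesis.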

Figure 1. Flow diagram of study selection

The studies and other documents included in the synthesis use a variety of formats and

methodologies, as shown in Figure 2 below.

Figure 2 Analysis of included studies and documents by methodology or type

Quantitative research is the dominant research methodology for published research.

Interestingly mixed methods research studies were only found in the grey literature. The

‘other’ category includes opinion pieces, letters and comment, conference and other

reports. Not surprisingly this is the area dominated by the grey literature.

Fifteen of the documents extracted from the grey literature were comment pieces in the

online medical news and media, Pulse (n = 4), BMA (n = 6), GPonline (n = 1), BMJ Careers (n

= 3) and Mancunian Matters (n = 1). The most disseminated document in the grey literature

was the AoMRC 2013-14 review (38); it was linked to the Royal College sites and returned as

a ‘hit’ when searching them. The document itself has little to say about differential

attainment: a short paragraph identifies the judicial review as a catalyst for the AoMRC’s

decision to “look at the wider question of differential attainment in medical education.” (38)

An examination of the dates of publication of the included documents testifies to a growing

interest in differential attainment. This is with the caveat that there is a time lag between

academic research and its publication that does not apply to online comment. But even

taking this into account a trend is clearly discernible.

Figure 3. Publication by date

Broadly speaking, the peaks of interest roughly coincide with significant changes to the

MRCGP in 2010 (specifically the CSA component), the publication of Esmail and Roberts’

report in 2013 and the judicial review in 2014.

8 Narrative synthesis

A narrative synthesis does not begin with a set of a priori assumptions. Using this method

themes emerge as the literature is identified and reviewed. The first level of thematic

identification is descriptive and can be generated in a number of ways including coding

followed by conceptual mapping to help us think about the relationships between and

across the themes identified.

Using the themes coded in NVivo 10 we identified two key areas of interest that emerged

across the literature: high stakes exams and ethnicity.

Fig 4 shows a conceptual map of the relationship between high stakes exams and ethnicity

in the published literature, with the sub-themes or factors either identified or investigated.

Figure 4. Conceptual map of themes identified in the published literature

N.B. the size of the ovals does not reflect significance of factor or quality of the research

The conceptual map, while amply demonstrating complexity, also provides a way of

populating a micro, meso, macro analytical framework that broadly relates to three key

levels of engagement: the individual or discrete group (student/s, doctor/s, examiners etc),

the institutional (medical school or work environment) and the level of policy (exams).

9 Findings

9.1 The Individual or discrete group

Although not discussing postgraduate education specifically, Schrewe makes a number of

insightful observations around the place of the individual in medical education and the

tension between the competing discourses of diversity (respect for the culture, gender and

ethnicity of individuals) and standardisation (uniformity and consistency).(39) Arguing that

these discourses need to be made explicit and “brought into the same conversation” in

order to enable students and trainers to achieve their full potential, Schrewe suggests that a

better understanding of the common qualities required and the extent to which individual

variation can be supported without detriment to the profession as a whole is the question

that needs addressing with some urgency.(39)

In this section of the findings we discuss themes identified in the literature pertaining to the

individual or discrete demographic group.

Study habits

Woolf examined ‘study habits’ as part of wider research into ethnic underperformance in

Year 3 medical students using a questionnaire to assess surface, deep and strategic learning

processes. Deep learning is associated with an active search for meaning, whereas surface

learning is associated with memorising rather than understanding. (40) Woolf found that

minority ethnic students scored lower on deep learning study habits (p = .003) and higher

on surface learning study habits (p =.008) than their white peers. (20) Strategic learning,

where learners adopt the best learning style to fit the needs of the task, was identified

by Woolf as a positive predictor of performance but was statistically related to other factors,

including living at home and having English as a first language. It is also important to

recognise that students should not be identified with a fixed approach to learning;

curriculum design, assessment and teaching style all encourage students to adopt a

particular approach. (41) This suggests potentially broader questions about ethnicity and

learning.

Psycho-social

Psycho-social is a term used to describe an individual’s psychological development in, and

interaction with a social environment. In the literature on widening participation psycho-

social factors in relation to undergraduate degree choice are well documented. (42)

As part of a larger undergraduate research study Woolf examined personality types of white

and non-white students using an adaptation of the NEO-PI-R (43) to identify five personality

types (neuroticism, openness to experience, agreeableness, extraversion and

conscientiousness). The study of a total of 703 (51% minority ethnic) students found that

ethnic minority students were lower on the personality trait “openness to experience” (p =

0041) (20) but this was not found to have a negative effect on final year examination

performance.

Social and cultural capital

The ‘standing’ that medical professionals have within different cultures has been shown to

have a significant effect on the choice of medicine as a career. A study linking informed

choice and academic success in Iranian medical students provides a useful international

review of studies that found many medical students having an “over-dramatized and

romanticized view of medicine at the beginning of academic studies”. (34) The Iranian study

used a multiple choice questionnaire (n = 2208) for final year medical students and found

that informed choice had a positive effect on attainment.

Success

Esmail recommends that more research is undertaken into factors for ethnic students’

success. (28) Whilst not looking at ethnicity, we identified one small scale qualitative study

using interviews with 10 black male medical students and 3 black male physicians at Florida

State University College of Medicine to explore their perceptions of the factors contributing

to their success in being admitted to and graduating from medical school. (22) The study,

with its gender, geographical and numerical limitations, nevertheless presented an

interesting line of enquiry looking at contributors rather than barriers to attainment.

The study concluded that factors contributing to success were a balance between

educational experiences, exposure to medicine, psychosocial-cultural experiences (including

family and other support networks) and personal attributes. Participants in the research

specifically identified structured activities like enrichment programmes and outreach

programmes as significant. The Minority Association of Pre-Medical Students Programme

(MAPS) was an example cited by the study participants. MAPS provided opportunities for

networking with other premedical students, medical students and physicians and

importantly provided the opportunity for shadowing experiences.

We then looked at the undergraduate literature to see if there were any other studies that

looked at contributors. One qualitative study based in Saudi Arabia used focus groups to

understand 19 mixed gender high achieving medical students’ perceptions of factors

contributing to their success.(21) They identified learning strategies, resource management

including family support, motivation and the efficient management of non-academic

problems i.e. stress.

In a study examining the differential achievement between white medical students and their

ethnic minority peers, Vaughan (19) used social capital theory to develop and analyse survey

data from medical students in the clinical phase of their training (n = 158). The research

found no link between ethnic and religious homophily and achievement. However,

interacting with problem-based learning group peers in study related activities and having a

wider academic support network were found to be directly linked to better achievement.

Vaughan concluded that ethnic homophily may cut minority students off from potential and

actual resources that facilitate learning and achievement. Therefore it is key that students

build wide relationships with colleagues at all levels of training.

Ethnicity

The underperformance of ethnic minorities compared to their white peers across the higher

education landscape has been consistently identified.(44) (45) The studies discussed in this

review focusing on the performance of UK-trained medical students and doctors from

minority ethnic groups have corroborated broader HEFCE findings.(27)

Definitions of ethnicity are numerous and complex. In the UK studies we discuss in this

review ethnicity was either self-declared specifically for an individual study or was a

characteristic already identified in a data set being analysed.

All these papers evidence a mix of educational and social factors as

contributing to the performance of individuals, in addition to individual

characteristics.

The literature examining contributors to success is important, since

looking only at why certain students might fail tells only half the story.

From the papers found, contributors to success seem to be international,

but with so few studies the results are not generalizable.

Classification systems used in the research also varied, and included the 2001 UK census

guidelines, (19, 20, 26) individual Royal College geographical bands, (46) white and non-

white, (47) BME as an umbrella group, (5) (11) (24) categories approved by the UK Commission

for Racial Equality, (9) and the GMC National Training Survey, (23) which uses UK census categories.

Studies use different categorizations and therefore comparisons between studies can be

difficult. For example, Denney cites the conflation of all BME groups under one heading as a

limitation of the study but states that it was necessary in order to compare and contrast the

results with other studies and because the numbers were too small in some sub groups. (11)

Woolf adopts the same approach, arguing that ethnic categories are to an extent artificial

because they can never take into account the subtle variations between groups of people.(5)

There have been a number of key large scale quantitative research projects since the 1990s

focusing on ethnicity and differential attainment. The catalyst for this area of research was

the identification of a higher failure rate in clinical exams among non-white students at the

University of Manchester in 1995. (48) The leading researchers in the field in the UK are

Chris McManus, Katherine Woolf, Jane Dacre and Richard Wakeford.

In a systematic review of ethnicity and academic performance in UK undergraduate and

postgraduate medical students, Woolf found ethnic differences in attainment to be

widespread across different types of medical school and different types of exam at both

levels of study.(5) The review focused on quantitative reports that measured performance

and concluded that differential attainment was both “consistent and persistent”: but while

ethnicity was clearly related to exam performance the reasons for this were not clear.(5)

The first large scale longitudinal study exploring in depth a number of potential

psychological and demographic reasons for differential attainment in undergraduate and

postgraduate medical students was led by Katherine Woolf (20).

In contrast to the studies focusing on measuring differences in attainment between

different groups, Woolf’s qualitative study (26) using focus groups and semi-structured

interviews (n = 27 medical students and 25 clinical teachers) followed earlier studies in the

US and examined the potential of stereotype threat to provide an insight into the identified

gap in attainment. Stereotype threat has been identified as a psychological phenomenon

whereby individuals who are members of a group characterized by negative stereotypes

perform below their actual abilities when group membership is emphasized. Woolf found

that negative stereotyping could impact on the relationship between lecturer and student

and therefore affect learning. She concluded that while a negative stereotype about an

ethnic group had “numerous implications for teaching and learning” the relationship was

neither simplistic nor deterministic.(26) Woolf concluded that the student/teacher

relationship was “vital for clinical learning”; in particular, the negative Asian stereotype was

considered to potentially jeopardise Asian students’ relationships with their teachers.

Woolf recommends that employers should facilitate teachers in getting to know their

students as individuals. Although the study was limited to one London medical school,

stereotype threat is an interesting line of inquiry that is not only relevant to ethnicity: for

example, Burgess has studied gender in terms of stereotype threat in the context of career

advancement in academic medicine in the US. (49)

IMG

IMGs are an important asset to the Health Service in the UK. In a review article in 2005

Sandhu opined that increasing numbers of IMGs would be needed to achieve the rapid

increase of workers needed as a result of legislation relating to the creation of a consultant

based service, and other working directives. (50)

Definitions of ethnicity are numerous and complex.

BME is a widely used term in public and private sector organisations to

incorporate a range of minority communities living in the UK. Such an

umbrella term has been critiqued in terms of the validity of grouping

together diverse groups in this way.

Conversely, for quantitative studies, broad terms may need to be used to

obtain statistically significant results.

Sandhu raises the concern that this requirement combined with the UK being a very

attractive place for medical graduates to work and continue their training could encourage

an influx of inexperienced doctors or doctors having poor communication skills seeking

opportunities in the competitive specialities. Sandhu advocates that more realistic

information about postgraduate opportunities and training be available to enable potential

IMGs to make a more informed choice, but also praises the motivation and determination of

IMGs as a group.

A study in the US found great persistence on the part of IMGs in pursuit of a US residency

position.(51) The linked data study of a cohort comprising 10,328 IMGs who were both US

citizen IMGs and non-US IMGs highlighted the importance of IMGs to the delivery of

national healthcare.

In a large scale study of RCOG data, Rushd undertook a retrospective analysis of the

performance of IMGs who appeared for the first time in the Part 1 (n = 11,863) and Part 2

written (n = 5336) MRCOG examinations between 2000 and 2010. (46) Rushd’s evaluation of

the first time performance of IMGs in the MRCOG part 1 and 2 written examinations

critiques IMG as a category by identifying variation in performance between students across

the RCOG geographical bands.

Rushd was unable to perform statistical comparisons to the results of the study since

geographical bands are not comparable: they contain different countries, different

academic standards, different teaching methods etc. Rushd, however, found that variation of

IMG performance was likely to be multifactorial and suggests that the introduction of e-

learning modules may “go some way in equalising the learning opportunities among

geographic regions and could prove useful for both trainers and trainees.” (46)

Aside from Illing’s study, discussed below, (2) we only found one qualitative study examining

barriers and facilitators encountered by IMGs. The study was situated in the Netherlands

and the findings related mainly to sociocultural rather than educational factors, including

being able to access information and financial support. (16) Lack of command of the Dutch

language (particularly the medical terminology) and age were seen as barriers to securing

employment and entrance to specialism. Age was only a barrier in some specialisms since

they set an upper age limit for postgraduate specialist training.

The study concluded that better support to overcome difficulties inherent to migration and

career change would result in better trained and acculturated doctors. The GMC has

recently undertaken some work in this area and developed a ‘Welcome to UK Practice’

programme to raise awareness about practice in the UK. (52)

In contrast to Vaughan, who cautioned against homophily, (19) a presenter at the RCPsych

conference (2014) encouraged IMGs to join and become active in diaspora organisations,

thereby familiarising themselves with working in the NHS and broadening their network of

professional contacts. (53)

The RCPsych convened a conference in 2014 to focus on familiarising IMGs with working in

psychiatry in the UK. The conference was organised in recognition “that IMGs face more

problems than British graduates in succeeding in the system.” (53) The college is keen to

support IMGs by commissioning an external review of the MRCPsych exam and ARCPs and

appointing an Associate Dean for Trainee Support.

Feedback from the delegates was positive and the college plans to run another in

2015. There was a recognition by delegates of the importance of trainers, the role of

employers in developing meaningful induction programmes and giving IMGs additional

support and remediation if required. Among the recommendations proposed at the

conference were that the College appoint local and national IMG Champions and improve

examiner training to help recognise unconscious biases (accents, manner etc).

IMG is a category that needs to be problematized and properly defined.

The literature identifies IMGs as increasingly important internationally to the

delivery of healthcare.

IMGs are noted for their persistence and tenacity in pursuing postgraduate

qualification.

IMGs face the inherent difficulties of migration and acculturation. These include

language, accessing information, financial support and limited knowledge of the

healthcare system.

Language

A number of studies discussed language either as a sole focus or as part of a number of

compounding factors. Woolf’s longitudinal study using exam data and questionnaires over

two consecutive year 5 cohorts (n = 703: 51% minority ethnic) found that speaking English

as a first language, with one parent also speaking English as a first language and being

schooled in the UK, was a predictor of good performance in final year UCL medical students.

However, not having this level of English was not the reason why minority ethnic students

underperformed. She suggests that where examinations like the OSCE require

communication skills, “country of schooling could be a proxy for communication or cultural

differences.” (20)

This finding concurs with those of Watmough (21), discussed above, who was also unable to

identify language as a determining factor in success in the RCA postgraduate examination.

The most significant study exploring language and cultural factors was undertaken by

Roberts and funded by the Economic and Social Research Council (ESRC). (3) This study

used a sociolinguistic methodology to examine both how candidates performed in the RCGP

exam and how the specific conditions of the exam operated to determine behaviour.

In specific relation to the CSA, but with wider implications for other practical exams in both

undergraduate and postgraduate contexts, Roberts’ study found the “relatively

decontextualized nature of the CSA made it a ‘talk-heavy’ assessment from which a number

of effects flow”. These include “communicative performance factors’ which relate to how

IMGs talk and interact with role playing patients, examiner perceptions of candidates

sounding formulaic and not engaging with the patient through a patient centred model.”

The researchers suggest that the sociolinguistic “fingerprint” of the exam which assumes a

patient centred approach could constitute a “hidden curriculum.”(3)

The study concludes that “Rather than talk of ‘cultural bias’ or not, there needs to be a

debate about tolerances and communicative flexibility, about what are acceptable

competencies in an increasingly diverse society and how, within these competencies, talk

and interaction can be more explicitly addressed. ‘Cultural bias’ implies that there is a goal

of neutrality that must be reached and that there is one ‘culture’, one way of doing

things.”(3)

Memon argues that oral examination is an important element of postgraduate examinations,

but ensuring its reliability and validity across specialisms is complex to design and

implement. (35) Memon cites the work commissioned by the RCGP in this area of

postgraduate examination as an example of good practice in providing an evidence base for

the validity and reliability of the oral elements of their exam. Memon cautions that IMGs

taking exams in other specialities may be disadvantaged if their English is less fluent and

articulate than UK trained candidates.

Knight, an MRCGP examiner, argues in an editorial piece that while there is evidence that

the MRCGP is reliable, IMGs are prone to failure because the exam is in English and they

spend much of their practice consulting in other languages. (36) Aside from language, Knight

also cites other factors that may impact on IMG success in the MRCGP, including clinical

environments in the UK that differ from those in which they trained and the fact that they may spend

much of their consulting time in the UK speaking in a language (or languages) other than

English.(36) Knight and Roberts both identify the failure to acknowledge or assess multilingual

expertise, which both see as an asset in an increasingly diverse UK society.

The specialities with the highest proportion of IMG candidates are the MRCGP and the

MRCP (particularly psychiatry). (45) These specialities require significant levels of cultural

awareness and advanced communication skills, both of which may place IMG students at a

disadvantage. (17)

Issues around IMG students and language are not unique to the UK, but are also evident in

other countries where there are minority groups. (54)

While language may be a predictor of good performance it is not, of itself, the reason why students fail.

Language is often conflated with sociolinguistic performance.

There is currently no acknowledgement or assessment of multilingual expertise.

Gender

Two papers (both American) compared female attainment against male attainment in

obstetrics and gynaecology (Obs/Gyn) (55, 56). Both studies conclude that women

outperformed men in the Obs/Gyn specialism.

Bibbo’s study found that on the pre-clerkship MCAT measures men outperformed women,

but on the overall clerkship scores women outperformed men. This was due to women’s

higher achievement on the standardised National Board of Medical Examiners (NBME)

subject examination. Drawing on other literature, a number of proposals were made as to

why this might be the case, including men being less interested in the specialism and

consequently less motivated, combined with the perception that patients prefer a female

physician. Women, in contrast, are potentially more motivated because they want to enter

this specialism due to gender identification, and the dominance of women already in the

field.(55)

Cuddy’s study on examinee gender and United States Medical Licensing Examination (USMLE)

performance also found men outperforming women at Clinical Knowledge (CK) Step 1 of the

exam but with women outperforming men at CK Step 2 (clinical skills), and with women

outperforming men in most content areas of Obs/Gyn, paediatrics and psychiatry; in contrast,

men outperformed women in medicine, surgery and preventative medicine.(56)

In a Norwegian study of 2474 Norwegian residents who began specialization in 1999-2001

(36), Johannsen found that although women progressed more slowly than men, the gender

variation was not significant when the effects of childbirth and having children under 18

were controlled for. However, gender was found to have a strong influence on choice of speciality

due to longer required working hours, for example in emergency services.

In combination, these studies identify a gender split in specialisms, for example the dominance of women in Obs/Gyn.

Identified gender differences in exam performance may potentially be linked to gender motivation to succeed in specific specialisms and/or gender identification with certain specialisms.

Studies suggest that changes to the hospital environment, working practices and cultures could encourage a more even gender split across the specialities.

9.2 The institutional

The Medical School and the working environment

In a Norwegian study (36), Johannsen looked at hospital specific factors in speciality choice

and qualification. The study found that hospital factors were significant predictors for the

participants’ (n = 2474) timely attainment of specialization. Working at university hospitals

(regional) or central hospitals was associated with a reduction in the time taken to complete

the specialization, “whereas an increased patient load and less supervision had the opposite

effect.” Johannsen’s study suggested that more flexibility in the curriculum would be

beneficial.

Illing, using quantitative and qualitative data, describes how senior overseas doctors who

come to the UK with established clinical practices may find adapting to a different

workplace culture difficult and not have access to the support available to less experienced

doctors.(2) IMGs may also have difficulty understanding roles and responsibilities in the

NHS structure, in addition to its patient-centred culture and holistic model of care. (2)

Two studies identified a need for a greater emphasis on Equality and Diversity and cultural

awareness in training within organisations with targeted events and diversity initiatives used

as opportunities. (3, 4)

As part of McManus’s data linkage study into PLAB and UK graduates’ performance on

MRCP(UK) and MRCGP examinations, a comparison between graduates from different UK

medical schools was performed. (7) The study found “clear and large differences in

performance at MRCP(UK) between graduates of different medical schools.” (7) However,

the study concluded that the identified differences in training could not account for the

poorer performance of IMGs.

Esmail advocates examining the distribution of IMGs and BME doctors across UK medical

schools in order to ascertain if the selection and training placement processes could operate

against the interests of weaker candidates, thus encouraging a cycle of educational

deprivation. (10) This observation is supported by Tiffin. (23)

Mentoring

There is a significant body of literature around mentoring for medical students and doctors

at all levels of study, with the majority of studies being undertaken in the USA. (18) Frei’s

review concludes that mentoring is “an important career advancement tool for medical

students” and that more programmes should be set up in Europe, but monitored and

assessed for impact.(18)

In terms of mentoring in the context of postgraduate medical training, the literature is not

well developed, although there is support from the Royal Colleges and the NHS generally.

(57, 58) Stamm’s study examining mentoring as part of a developmental network, set in

Switzerland, found that only 50% of doctors undergoing specialist training (n = 326) took

advantage of mentoring despite the positive benefits identified and, of those, females

received less mentoring than their male colleagues. This gender gap was

attributed primarily to extraprofessional concerns. Stamm concludes that, given the

often less straightforward career path for females, mentoring is particularly important.(59)

Steven’s qualitative study over six NHS sites identified benefits across the

professional-personal interface. Steven suggests that successful mentoring makes doctors feel more

confident and satisfied in their work, and this will have beneficial impacts for organizations.

(60)

Selection

The literature on postgraduate selection is less developed than that of undergraduate

selection into medical school. The UK general practitioner selection process uses a national

machine markable shortlisting test to assess both cognitive and non-cognitive skills and a

‘corporately owned’ and validated selection methodology. Plint (42) summarises the success

of the process and the confidence in it from both students and deaneries as: “Corporate

commitment to national process; legitimate authority and locus of control; process of

incremental convergence, rather than imposition; development and adoption of validated

selection method; representative infrastructure operating the process.”

McManus undertook a significant study examining the educational background and

qualifications of UK students from ethnic minorities and the selection for medical school.

(47) The study addressed the assumption that entrants to medical school are equivalent in

their academic ability and that, following from this, differential attainment in

undergraduate medical exams and beyond was accounted for at some point after selection

to medical school. The study found, however, that non-white students had slightly lower A

level and GCSE grades than their white peers. It concluded that while GCSE and A level grades

might explain some of the effects found, they could not entirely explain the poorer

performance of non-white students at medical school and beyond.

Citing the GP selection process as an example, Paterson recommends a more robust

selection process.(4)

9.3 Policy

Predictors of success at postgraduate level

Woloschuk’s small scale study (n = 244 medical graduates) at the University of Calgary,

Canada found that measures of undergraduate performance seemed to be poor predictors

of postgraduate success. In particular they found a ‘weak’ relationship between

performance in the Medical Council of Canada (MCC) national licensing exam, which they

describe as a “rite of passage” to postgraduate training, and residency. They suggest that

success may be due to non-cognitive attributes, for example work ethic, personality and

motivation (61).

Understanding workplace culture is important.

Mentoring programmes are beneficial but they need to be robust and evaluated to ensure they are effective.

The literature on postgraduate selection is less developed than that of undergraduate selection into medical school.

Prior attainment could not entirely explain the poorer performance of non-white students at medical school and beyond.

Selection processes for postgraduate study are highly variable.

PMQ

In a high quality comparative study of UK trained doctors and those whose PMQ was gained

outside the UK sitting the RCA exam from 1999-2008, Watmough (62) found that candidates

from Egypt, Iraq, Ireland and Pakistan performed significantly worse than those from

Australia, New Zealand, South Africa, Zimbabwe and the UK. From June 1990 to February

2008, there were 9,315 attempts at the MCQ by 5,797 graduates from 70 countries, with 25

countries having candidates who made 15 or more attempts. The analysis was undertaken

using data from the written part of the exam which uses multiple choice questions to test a

range of generic clinical skills. The MCQ is a high stakes exam essential for career

progression to consultant level. The study did not find a coherent pattern to attainment and

concluded that “some IMG graduates who sit UK postgraduate exams may require

additional support prior to taking the exam.” Importantly, the underperformance of

students from Ireland and Pakistan, where English is the main language in medical

education, indicates that language is not a key factor in differential attainment in this exam.

The authors suggest that rather than language it may be that cultural ties ease the transition

to working in the UK; however, the poor performance of candidates from the Republic of

Ireland casts doubt on this supposition.

High stakes examinations

High stakes exams all contain a number of components: assessment of practical skills using

‘real’ or simulated clinical scenarios, multiple choice, written and oral. Different elements of

the exam are marked in different ways: by computer, by examiner and by assessment of

skills.

It is important that the transparency and fairness of ‘high stakes’ exams be demonstrated

given the influence they have on a doctor’s career progression and employment

opportunities. Memon, in relation to the specifics of oral examination in postgraduate

examinations, argued for the Royal Colleges to undertake much more rigorous validity and

reliability testing on their high stakes exams. (63)

Wakeford’s large scale assessment of validity and differential performance by ethnicity in

the RCGP and MRCP(UK) examinations followed in the wake of a Judicial Review.(24) It

sought to evaluate if the performance of candidates in the MRCP(UK) was predictive of their

attainment in the MRCGP (usually taken 3-4 years after). The study found substantial

correlations between a candidate’s performance in the two exams, which provides support

for the validity of each. (24)

Wakeford identified a higher correlation between PACES and the new CSA than the old,

suggesting that the new CSA is a more valid assessment. (24) In addition, the study found

that BME candidates in particular showed a higher correlation between PACES and the CSA

than white students “suggesting that there is less extraneous variance between BME

candidates making it a more valid assessment.”(24)

MRCGP and MRCGP Clinical Skills Assessment (CSA)

The CSA exam was revised in the autumn of 2010 to improve the reliability of the

assessment. In their key study, using previously unavailable data from the

GMC and the RCGP, Esmail and Roberts examined ethnic minority candidates’ performance in the MRCGP

exams between 2010 and 2012, therefore testing the new CSA.(28)

The headline conclusion was that “subjective bias due to racial discrimination in the clinical

skills assessment may be a cause of failure for UK trained graduates and international

medical graduates.”(10)

Discrimination is an incendiary term. Judith Hawkins writing in Mancunian Matters

explained the format of the CSA to its non-medical audience and outlined Esmail’s findings.

The article stimulated a heated online debate among readers who were only too willing to

support claims of racial discrimination.(64)

Esmail and Roberts suggested that the different training experience and other cultural

factors (patient/doctor relationship and proficiency in spoken English for example) between

UK and non-UK trained candidates could affect exam outcomes. However, they did not

consider that these cultural factors could entirely account for differential attainment

between white and BME UK trained candidates. It was suggested that discrimination could

occur at a number of points in the CSA: the behaviour of standardised patients to white and

non-white candidates and bias on the part of the examiners.(10)

McManus’s study (7) leads on from this study by Esmail and Roberts (10) although there are

significant differences between the two: McManus’s study analyses PLAB part 1 (which

Esmail does not) and it analyses a larger dataset (n = 7,829) from MRCP(UK) compared to (n

= 5,055 candidates + 1,175 not trained in the UK). Both studies use candidates’ marks at 1st

attempt for all analyses. McManus’s study found that IMGs’ lower performance was

“unlikely to result from systemic examiner bias or discrimination.”

Knight states in an opinion piece that, while there is evidence that the MRCGP is reliable,

IMGs are prone to failure because the exam is in English and they spend much of their

practice consulting in other languages. (15)

The CSA exam was revised in the autumn of 2010: the new CSA has been found not to

discriminate between white and BME candidates. (24) However, the CSA will inevitably carry

implicit cultural associations specific to UK medicine. Esmail states that the CSA is not, and

was not intended to be, a culturally neutral exam. Therefore, UK graduates are likely to be

initially more successful, because they are acculturated. (28)

The CSA was consistently identified in the medical news media as a particular issue for IMGs.

Commentary was prompted by both Esmail’s study (65, 66) and the judicial review. (67, 68)

MRCOG

The MRCOG is an internationally recognised standard and at the time of Rushd’s study more

than 85% of the total candidates were IMGs. The study found that MRCOG examination

success rates were significantly different according to the university of medical graduation.

Rushd also identified a variation in performance among graduates from different medical

schools in Parts 1 and 2 of the MRCOG written examination, which was comparable to

those schools’ performances on the MRCP(UK). (46)

PLAB and IELTS

If IMGs are going to sit the PLAB they need to demonstrate that they have achieved an

acceptable level of English via IELTS in the previous two years. PLAB was reviewed in 2011 to

assess whether the knowledge and skills demonstrated by passing the PLAB continued to be

equivalent to those demonstrated by an F1 doctor. A key component of this review was to

examine any disparity between IMGs, who successfully passed the PLAB test, and their UK

graduate peers in postgraduate examinations.(7) Aside from difficulties relating to direct

comparison the study concludes that there are good correlations between PLAB and the

MRCP(UK) and MRCGP which means that PLAB is a valid assessment of skills relevant to

progression during UK postgraduate training. It should be noted however, that PLAB is not

designed to predict postgraduate exam performance, or to ensure that those passing PLAB

can achieve at postgraduate level.

In order to produce outcome equivalence between IMG and white graduates, it was

suggested by McManus that the PLAB pass mark could be set higher; however, this would

have significant impacts on health service delivery.(7)

Tiffin’s study of UK based trainee doctors with at least one competency related ARCP

outcome (n = 53,463, of whom 11,419 were IMGs registered following a pass via the

PLAB route) in the study period also found that the PLAB test was not generally equivalent

to the requirements for UK graduates, with the standard of English competency and the

PLAB pass mark needing to be raised to ensure equity.(23) Tiffin also discusses how PLAB

candidates with lower scores may not be able to secure a post in their preferred specialism

and therefore successfully apply for “shortage specialities” like psychiatry and general

practice. Given that these specialisms require enhanced communication skills some IMGs

may immediately be disadvantaged. (23)

Sandhu notes that the requirement to pass these exams cuts into IMGs’ time and can erode

the time available for research, leaving IMGs’ CVs weak in publications, which can

affect their chances of being shortlisted for jobs in spite of clinical experience.(50)

Much of the research into differential attainment is quantitative with a focus on testing for

bias in the exam, or a component part of it using exam data. Taken as a whole these studies

have broadly demonstrated the validity of some high stakes exams and discounted evidence

of bias in the exams themselves leading to differential attainment. This view is endorsed by

Patterson with the caveat that it is not an endorsement of all assessment tools.(14)

Examiner bias

Examiner bias in relation to examinations like the MRCGP and the MRCP, in which candidates

are judged ‘live’ and examiners can therefore identify a candidate’s gender and ethnicity, has

been frequently questioned (8, 9, 11).

Examiner bias is a potential risk in any examination and a threat to the validity of an

examination. The first study in this area by Dewhurst (9) focused on the MRCP(UK) and

found any potential examiner prejudice was only significant when two non-white examiners

examined a non-white candidate. This, Dewhurst suggested, was not conscious and may

relate to a consistency in communication style and cultural understanding.

McManus’s investigation into possible bias as a threat to validity used data from

MRCP(UK) PACES and nPACES examinations.(8) The study found that having two

independent examiners reduced any potential for bias and judged it a preferable method of

assessment over a single examiner. This is an example of how the infrastructure around an

exam can potentially impact on the outcome for the candidate.

Denny’s study, which investigated potential examiner bias as responsible for differential

attainment in the MRCGP CSA, found no evidence to support examiner bias. It found that

differential attainment was linked to the candidates’ demographic rather than the

examiners’. (11)

In a letter to the BMJ, Shaw opined that the high failure rate of ethnic candidates in the new CSA

was an unintended consequence of selection and examination rather than examiner bias.

Shaw attributed the higher failure rate to the combination of a disincentive for medical

schools to increase trainee numbers and the raised standards of English required by the new

exam. (69)

The BMA suggests that considerable variability between Royal Colleges in the ways in which

they monitor for protected characteristics in exam candidates could be a potential

contributor to unfair bias and differential attainment in specialist examinations. (70) This

was endorsed in BMJ Careers. (13)

In addition the BMA proposes that colleges should annually monitor the diversity of

examiners and actors and cross reference this data with individual candidates’ performance.

10 Discussion

In this discussion section we return to the research question and structure the discussion

around the three key areas of interest: potential causes of differential attainment identified

in the literature, ways in which differential attainment has been researched and finally we

discuss potential interventions. This segmentation, it should be stressed, is to an extent

artificial since the three elements are interrelated. The identification of possible causes may

suggest ways of researching, from which interventions can be developed; causes may

suggest interventions; specific research methodologies may identify possible causes, etc.

10.1 Causes

In the AoMRC statement of principles there is a recognition that differential attainment

does not in itself demonstrate that the exam, curricula content, process or delivery is

discriminatory.(71) This document cites the GMC’s independently commissioned report which

found that “the method of assessment was not the reason for the differential outcomes.”

(28)

Studies are tending to move away from the analysis of exam data and towards less tangible

educational and social factors as contributing to performance. The general point to draw

from this development of research over time is to look not just at results but at the

‘whole’ of the exam and its candidature. This includes both the pragmatic decisions made by

candidates (why they chose a specialism, for example), and the factors beyond the

assessment process, for example the way Royal Colleges record protected characteristics

data and the impact this may have on demographic analysis.(70)

Much of the research into differential attainment is quantitative with a focus on testing for bias in the exam, or a component part of it using exam data.

Taken as a whole these studies have broadly demonstrated the validity of high stakes exams and discounted evidence of bias in the exams themselves leading to differential attainment.

It is important to acknowledge how the infrastructure around high stakes exams may also lead to bias and/or differential attainment.

Highly variable selection and induction processes for postgraduate study in the UK have

been identified. The RCGP selection process is well developed and can be robustly defended.

However, poor induction and lack of support for IMGs in overcoming the difficulties

inherent to migration and career change may disadvantage IMGs in becoming better trained

and acculturated doctors. The role of trainers and employers is potentially pivotal to

supporting IMGs. In particular, identifying the culturally specific needs of IMGs will be helpful in

supporting IMGs to adjust to the realities of practice in their adopted country.

Language as a factor has been emphasised across a number of studies. However, quite

what is meant by language is not always clear. Although not stated explicitly, ‘language’ is a

category of explanation that includes more than just words; it is often conflated with

‘communication’ in the context of the CSA. One critique of the CSA and other similar

assessments is that they carry with them implicit cultural assumptions and lack of

acculturation will impact on performance and ultimately attainment, even if clinical skills are

to an expected standard.

Differential attainment is not just about ‘failure’; understanding why students succeed is

also important and could provide pointers to factors that need to be encouraged/facilitated

in order to improve student performance.

From the papers found, factors leading to success seem to be international but with so

few studies the results are not generalizable. Factors identified might therefore be unique

to given student communities and a UK based study could be helpful, because

looking only at why certain students might fail tells only half the story. The information gained

from such a study could be used by medical schools to inform both their teaching and their

widening participation initiatives.

10.2 Ways of researching

Patterson suggests that more open dialogue between stakeholders is essential if the

complexity of differential attainment is to be fully understood and addressed. (14) She

suggests more innovative research approaches are needed, including longitudinal tracking,

interdisciplinary research to provide fresh perspectives and the development of appropriate

theoretical frameworks. In particular, she advocates detailed case studies of “outliers” as a

way of identifying facilitators of success.(14) Woolf’s exploration of stereotype

threat (26) is an example of research that adopts a fresh perspective by drawing on other

disciplines. The literature strongly points towards interdisciplinary research as the future

direction research will need to take in order to examine the complexity of differential

attainment.

In this review we have discussed studies that have used a number of different research

methods. The majority of studies included in the synthesis have used quantitative analysis

(n = 25), but have often recommended additional qualitative research. For example, using

qualitative analysis to drill down into Woolf’s findings (20) would be helpful in identifying

where tailored support may be efficacious.

With the recognition that ‘causes’ are complex it becomes appropriate to use qualitative

research approaches, traditionally favoured when the main research objective is to improve

our understanding of a complex phenomenon embedded in its context. Qualitative research

is expensive and labour intensive, so much of it is small scale and local in nature.

Drawing on a wider qualitative literature in areas like ‘diversity’ could provide

methodological and empirical insights. We have not identified any studies that draw on a

wider literature. For example, medical education could learn much from the wider literature

on learning styles, which could inform the current significant gap in knowledge. The wider

educational literature on mentoring, for example, and other educational interventions

would be a way of informing potential interventions.

A significant issue across the research is the lack of transparency around the categories of

explanation. The category BME, for example, needs interrogating. Denny’s research

critiques the conflation of BME into one category in research, suggesting qualitative

research is required in order to “understand the detailed genesis of performance

differences.” (11)

Ethnic and ethnicity are similarly ambiguous terms: interventions to address the

differential attainment ‘problem’ cannot be developed without first recognising the ‘problem’ of

definition and classification within BME and IMG across medical school and Royal College

data and research.

Differential attainment is of course not confined to ethnicity, but this is the theme that has

dominated the research. We found limited research into differential attainment in relation

to gender and some in relation to age, the latter as an additional demographic category

rather than the focus of any studies. That women outperform men in undergraduate

education and continue to do so in certain specialisms at postgraduate level seems to be a

given. We found no studies examining the contributors to the success of these female

students aside from gender in itself as a predictor.

10.3 Possible interventions

The review found only two examples evaluating specific interventions tailored for

postgraduate study. Stamm’s longitudinal quantitative study of postgraduate Swiss medical

students found that mentoring had a positive impact. (37) The study was limited by relying

on self-reported data and the limited age range of participants. The second study, by Plint,

discusses the GP recruitment process as an example of a best-practice selection process. (6)

As researchers in undergraduate selection have long recognised, there is a correlation

between selection and subsequent performance, but there is little research in the

postgraduate context. Anecdotally we know that medical schools and employing

organisations implement interventions to support students, doctors, trainers and the

workforce as a whole. However, we found scant evidence of this in the literature. Individual

medical schools are probably doing excellent work, but this is likely to remain

undisseminated in the academic literature, because it is related to teaching practice and will

not be available for public access through the internet. This is a key finding in its own right,

raising questions about how this information can be found and assessed for impact.

The studies show that there are a variety of issues that affect the performance and

attainment of students. These include issues around the background and characteristics of

the individuals, the stage they are at in their medical career and the medical school or

workplace environment. These might have cumulative effects over time or ‘one-off’ effects

at certain stages of their career.

We found very few studies examining interventions, and those that we found are not robust

enough to demonstrate clear evidence of a definitive intervention aside from mentoring.

However, we can identify some ‘trends’ in the literature that can inform the ‘look’ of the

overall structure of an intervention to improve student performance and/or attainment.

Due to the variety of identified factors affecting performance and attainment, it is more

likely that a complex intervention needs to be designed rather than a simple intervention in

order to address the complexity of differential attainment. The first question in designing an

intervention would relate to the level at which the intervention is required: at the individual

level, the institutional level, a broader policy level, or a complex intervention with

components on each level.

It is important to consider any intervention in the light of all levels. For example, raising the

PLAB 1 and 2 pass marks (policy level) would provide an equivalence of

performance with UK medical graduates (23) and candidates for the MRCP(UK) and MRCGP

(7), but it would also reduce attainment, with workforce implications.

An example of an intervention initiated at the policy level but operationalised at the

institutional level to benefit individually identified learning needs was the proposed

appointment of local and national IMG Champions and improved examiner training to help

recognise unconscious biases (accents, manner, etc.). (53) This recommendation was

developed through a conference workshop at the 2014 RCPsych conference devoted to

IMGs.

The literature identifies that much of the support needed for IMGs, at least initially, is

practical. One key area identified is the information available prior to arrival, appropriate

induction and ongoing support. The undergraduate literature is well developed in this area

but there should not be an uncritical translation of undergraduate processes to the

postgraduate context since the postgraduate experience is much less structured and the

curriculum more fragmented.

Development of intercultural competence is deemed essential for successful

communication. Patterson identifies one of the most challenging aspects as being the “ability

to distinguish between idiosyncratic and culturally conditioned behaviours.” (4) The RCPsych

conference is an example of a targeted event (policy level) to examine what support IMGs

need, but there are doubtless other targeted events and interventions at the institutional

level as part of internal diversity initiatives. However, there is no pool of best practice to help

develop networks.

The identification in the literature of the complexities of differential attainment strongly

suggests that students need to be supported on an individual level, and this may require

significant changes at the institutional level. These include greater flexibility in the curriculum

and the acknowledgement that some students will take longer than others to reach a stage

where they can be confident of passing a given exam, rather than making multiple attempts.

11 Conclusions (the story so far)

This rapid review, conducted over three and a half months, has identified the multifactorial

nature of differential attainment in postgraduate medical education in the UK.

The review has identified a narrative of research interests and methods that have developed

from the quantitative analysis of exam data, with the aim of locating bias, towards a

more nuanced research approach looking at both educational and social factors as potential

contributors (rather than single causes) to differential attainment. The development of the

research over time shows us that researchers need to understand the micro, meso and the

macro-structure of medical education in order to understand differential attainment. The

literature strongly points towards interdisciplinary research as a future direction research

will need to take in order to examine the complexity of differential attainment.

Interventions will have cost implications, yet we found no examples of cost benefit analysis

in the literature. This is an important omission and should be structured into any evaluation

of an intervention along with a rigorous analysis of impact.

The literature clearly identifies the growing importance of IMGs internationally to the

delivery of healthcare and the increasing globalization of the medical workforce. Rushd

suggests that the introduction of e-learning modules for trainers and trainees may go

towards developing parity across geographic regions (46), while Roberts discusses the need

for a debate about tolerances and communicative flexibility, and about what are acceptable

competencies. (3) Certainly a better understanding of the challenges of transition faced by

individuals entering the UK workplace could inform interventions at individual, institutional

and policy levels.

IMGs are part of an implied movement towards ‘global standards’ for medical exams as

medicine becomes internationalized. This is potentially the next chapter in the story and

ensures that differential attainment will not only remain firmly on the agenda but

potentially become central to this emerging discourse and a driver for change.

12 References

1. Popay J, Roberts H, Sowden A, Petticrew M, Arai L, Rodgers M. Guidelines on the conduct of narrative synthesis in systematic reviews. 2006.
2. Illing J. The experiences of UK, EU and non-EU medical graduates in the transition to the UK workplace. 2009.
3. Roberts C, Atkins S, Hawthorne K. Performance features in clinical skills assessment: Linguistic and cultural factors in the Membership of the Royal College of General Practitioners examination. London: Centre for Language, Discourse & Communication, King's College London; 2014.
4. Patterson F, La-Band A, Koczwara A, Spicer J. GP National Selection Process: Equalities Impact. 2012.
5. Woolf K, Potts HWW, McManus IC. Ethnicity and academic performance in UK trained doctors and medical students: systematic review and meta-analysis. BMJ (Clinical Research Ed). 2011;342:d901.
6. Plint S, Patterson F. Identifying critical success factors for designing selection processes into postgraduate specialty training: the case of UK general practice. Postgraduate Medical Journal. 2010;86(1016):323-7.
7. McManus IC, Wakeford R. PLAB and UK graduates' performance on MRCP(UK) and MRCGP examinations: data linkage study. BMJ. 2014;348:g2612.
8. McManus C, Elder A, Dacre J. Investigating possible ethnicity and sex bias in clinical examiners: An analysis of data from the MRCP(UK) PACES and nPACES examinations. BMC Medical Education. 2013;13.
9. Dewhurst N, McManus C, Mollon J, Dacre J, Vale A. Performance in the MRCP(UK) Examination 2003-4: analysis of pass rates of UK graduates in relation to self-declared ethnicity and gender. BMC Medicine. 2007;5(8).
10. Esmail A, Roberts C. Academic performance of ethnic minority candidates and discrimination in the MRCGP examinations between 2010 and 2012: analysis of data. BMJ (Clinical Research Ed). 2013;347:f5662.
11. Denney ML, Freeman A, Wakeford R. MRCGP CSA: are the examiners biased, favouring their own by sex, ethnicity, and degree source? The British Journal of General Practice: The Journal of the Royal College of General Practitioners. 2013;63(616):e718-25.
12. Hawtin KE, Williams HR, McKnight L, Booth TC. Performance in the FRCR (UK) Part 2B examination: analysis of factors associated with success. Clinical Radiology. 2014;69(7):750-7.
13. Rimmer A. Royal colleges must improve data on diversity of exam candidates BMA says. BMJ Careers. 23 January 2014.
14. Patterson F, Denney M-L, Wakeford R, Good D. Fair and equal assessment in postgraduate training? British Journal of General Practice. 2011:712-3.
15. Knight RA. Reasons why doctors who perform well as doctors may fail the MRCGP clinical skills assessment exam. BMJ (Clinical Research Ed). 2013;347:f6438.
16. Huijskens EG, Hooshiaran A, Scherpbier A, van der Horst F. Barriers and facilitating factors in the professional careers of international medical graduates. Med Educ. 2010;44(8):795-804.
17. Farrokhi-Khajeh-Pasha Y, Nedjat S, Mohammadi A, Malakan Rad E, Majdzadeh R. Informed choice of entering medical school and academic success in Iranian medical students. Medical Teacher. 2014;36(11):978-82.
18. Frei E, Stamm M, Buddeberg-Fischer B. Mentoring programs for medical students - a review of the PubMed literature 2000 - 2008. BMC Medical Education. 2010;10(1):32.
19. Vaughan S, Sanders T, Crossley N, O'Neill P, Wass V. Bridging the gap: the roles of social capital and ethnicity in medical student achievement. Medical Education. 2015;49(1):114-23.
20. Woolf K, McManus IC, Potts HWW, Dacre J. The mediators of minority ethnic underperformance in final medical school examinations. The British Journal Of Educational Psychology. 2013;83(Pt 1):135-59.
21. Abdulghani HM, Al-Drees AA, Khalil MS, Ahmad F, Ponnamperuma GG, Amin Z. What factors determine academic achievement in high achieving undergraduate medical students? A qualitative study. Medical Teacher. 2014;36 Suppl 1:S43-S8.
22. Thomas B, Manusov EG, Wang A, Livingston H. Contributors of black men's success in admission to and graduation from medical school. Academic Medicine: Journal Of The Association Of American Medical Colleges. 2011;86(7):892-900.
23. Tiffin PA, Illing J, Kasim AS, McLachlan JC. Annual Review of Competence Progression (ARCP) performance of doctors who passed Professional and Linguistic Assessments Board (PLAB) tests compared with UK medical graduates: National data linkage study. BMJ: British Medical Journal. 2014;348.
24. Wakeford R, Denney M, Ludka-Stempien K, Dacre J, McManus IC. Cross-comparison of MRCGP & MRCP(UK) in a database linkage study of 2,284 candidates taking both examinations: assessment of validity and differential performance by ethnicity. BMC Medical Education. 2015;15:1.
25. Bowhay AR, Watmough SD. An evaluation of the performance in the UK Royal College of Anaesthetists primary examination by UK medical school and gender. BMC Med Educ. 2009;9:38.
26. Woolf K, Cave J, Greenhalgh T, Dacre J. Ethnic stereotypes and the underachievement of UK medical students from ethnic minorities: qualitative study. 2008.
27. HEFCE. Student ethnicity: experiences in full-time, first degree study. http://www.hefce.ac.uk/media/hefce1/pubs/hefce/2010/1013/10_13.pdf. 2010.
28. Esmail A, Roberts C. Independent Review of the Membership of the Royal College of General Practitioners (MRCGP) examination. General Medical Council, 2013.
29. General Medical Council. Interactive reports to investigate factors that affect progression of doctors in training. http://www.gmc-uk.org/education/25495.asp. March 2015.
30. Woolf K, Potts HWW, McManus IC. Ethnicity and academic performance in UK trained doctors and medical students: systematic review and meta-analysis. 2011.
31. Haq I, Higham J, Morris R, Dacre J. Effect of ethnicity and gender on performance in undergraduate medical examinations. Med Educ. 2005;39:1126-28.
32. Lavis J, Davies H, Oxman A, Denis J, Golden-Biddle K, Ferlie E. Towards systematic reviews that inform health care management and policy making. Health Services Research & Policy. 2005;10(1):35-48.
33. Oliver S, Harden A, Rees R, Shepherd J, Brunton G, Garcia J, et al. An Emerging Framework for Including Different Types of Evidence in Systematic Reviews for Public Policy. Evaluation. 2005;11(4):428-46.
34. Ganann R, Ciliska D, Thomas H. Expediting systematic reviews: Methods and implications of rapid reviews. Implementation Science. 2010;5(56).
35. Watt A, Cameron A, Sturm L, Lathlean T, Babidge W, Blamey S, et al. Rapid reviews versus full systematic reviews: An inventory of current methods and practice in health technology assessment. International Journal of Technology Assessment in Health Care. 2008;24(02):133-9.
36. Johannessen K-A, Hagen TP. Individual and hospital-specific factors influencing medical graduates' time to medical specialization. Social Science & Medicine (1982). 2013;97:170-5.
37. Stamm M, Buddeberg-Fischer B. The impact of mentoring during postgraduate training on doctors’ career success. Medical Education. 2011;45(5):488-96.
38. Academy of Medical Royal Colleges. Academy of Medical Royal Colleges Review 2013-2014. 2014.
39. Schrewe B, Frost H. Finding potential in balance: navigating the competing discourses of diversity and standardization. Academic Medicine: Journal Of The Association Of American Medical Colleges. 2012;87(11):1479.
40. Marton F, Säljö R. On qualitative differences in learning: I - Outcome and Process. British Journal of Educational Psychology. 1976;46:4-11.
41. Biggs J. Teaching for Quality Learning at University. SRHE and Open University Press; 1999.
42. Moore J, Sanders J, Higham L. Literature review of research into widening participation to higher education: Report to HEFCE and OFFA by ARC Network. London: 2013.
43. Costa PT, McCrae RR. Revised NEO personality inventory (NEO-PI-R) and NEO five-factor inventory (NEO-FFI) professional manual. Odessa: Psychological Assessment Resources; 1992.
44. Feedback from the HEA, ECU and HEFCE sponsored summit. Supporting black and minority ethnic student success in higher education - narrowing the gap. 2012.
45. Singh G. A Synthesis of Research Evidence. Black and minority ethnic (BME) students' participation in higher education: Improving retention and success. 2012.
46. Rushd S, Landau AB, Lindow SW. An evaluation of the first time performance of international medical graduates in the MRCOG Part 1 and Part 2 written examinations. European Journal of Obstetrics & Gynecology and Reproductive Biology. 2013;166:124-6.
47. McManus IC, Woolf K, Dacre J. The educational background and qualifications of UK medical students from ethnic minorities. BMC Med Educ. 2008;8:21.
48. Dillner L. Manchester tackles failure rate of Asian students. BMJ. 1995;310:209.
49. Burgess DJ, Joseph A, van Ryn M, Carnes M. Does stereotype threat affect women in academic medicine? Academic Medicine: Journal Of The Association Of American Medical Colleges. 2012;87(4):506-12.
50. Sandhu DP. Current dilemmas in overseas doctors' training. Postgraduate Medical Journal. 2005;81(952):79-82.
51. Jolly P, Boulet J, Garrison G, Signer MM. Participation in U.S. graduate medical education by graduates of international medical schools. Academic Medicine: Journal Of The Association Of American Medical Colleges. 2011;86(5):559-64.
52. General Medical Council. Welcome to Practice. 2014.
53. Al-Taiar H, Menzies A. Report on the RCPsych IMG Conference 2014. Royal College of Psychiatrists; 2014.
54. Rimmer A. RCP is to highlight gap in performance between overseas doctors and UK graduates. BMJ Careers. 02 December 2014.
55. Bibbo C, Bustamante A, Wang L, Friedman F, Jr., Chen KT. Toward a Better Understanding of Gender-Based Performance in the Obstetrics and Gynecology Clerkship: Women Outscore Men on the NBME Subject Examination at One Medical School. Academic Medicine: Journal Of The Association Of American Medical Colleges. 2014.
56. Cuddy M, Swanson D, Clauser B. A Multilevel Analysis of the Relationships between Examinee Gender and United States Medical Licensing Exam (USMLE) Step 2 CK Content Area Performance. Academic Medicine. 2007;82(10):589-93.
57. Royal College of Obstetricians and Gynaecologists. Mentoring for all. RCOG Press; 2005.
58. Department of Health. Mentoring for Doctors: Signposts to current practice for career grade doctors. Guidance from the Doctors' Forum. London: DOH; 2004.
59. Stamm M, Buddeberg-Fischer B. The impact of mentoring during postgraduate training on doctors’ career success. Medical Education. 2011;45(5):488-96.
60. Steven A, Oxley J, Fleming WG. Mentoring for NHS doctors: perceived benefits across the personal-professional interface. Journal of the Royal Society of Medicine. 2008;101:552-7.
61. Woloschuk W, McLaughlin K, Wright B. Is undergraduate performance predictive of postgraduate performance? Teach Learn Med. 2010;22(3):202-4.
62. Watmough S, Bowhay A. An evaluation of the impact of country of primary medical qualification on performance in the UK Royal College of Anaesthetists' examinations. Medical Teacher. 2011;33:938-40.
63. Memon M, Joughin G, Memon B. Oral assessment and postgraduate medical examinations: establishing conditions for validity, reliability and fairness. Advances in Health Sciences Education. 2010;15(2):277-89.
64. Hawkins J. Ethnic minority trainee GPs are suffering racial discrimination, claims Manchester uni study. Mancunian Matters. 28 September 2013.
65. Differing pass rates raise concerns about MRCGP exam. BMA. 24 May 2013.
66. Low ethnic minority exam pass rates sparks call for research. BMA. 24 June 2013.
67. Action needed to end college exam disparity. BMA. 11 April 2014.
68. Duffin C. RCGP will ensure examiners are 'representative of race and ethnicity'. PULSE. 21 May 2014.
69. Shaw Q. High failure rate of ethnic minority groups in MRCGP exam comes from changes to exam and candidate selection. BMJ (Clinical Research Ed). 2013;347:f6442.
70. British Medical Association. Examining Quality: A survey of royal college examinations. Progress review. Equality and Diversity Committee, 2014.
71. Academy of Medical Royal Colleges. Fairness, equality and medical royal college exams: Academy of medical royal colleges statement of principles. 2014.

13 Appendices

Appendix 1: Studies and other documents included in the synthesis

Reference / Data Collection (quantitative or qualitative?)

Population & Setting (Context)

Perspective (the objective and standpoint of the study)

Intervention or Test used to evaluate the participants

Outcome (findings)

Conclusion

Quality assessment of study and Generalizability

Abdulghani HM, Al-Drees AA, Khalil MS, Ahmad F, Ponnamperuma GG, Amin Z. What factors determine academic achievement in high achieving undergraduate medical students? A qualitative study. Medical Teacher. 2014;36 Suppl 1:S43-S8.

10 male and 9 female high achieving (scores more than 85% in all tests) students, from the second, third, fourth and fifth academic years.

The aim of this study is to explore the high achieving students’ perceptions of factors contributing to academic achievement.

Qualitative study using focus group discussions

Factors influencing high academic achievement include: prioritization of learning, time management, and family support. Management of non-academic issues is also important.

Addressing these factors, which might be unique for a given student community, in a systematic manner would be helpful to improve students’ performance.

High quality study Generalizability = indirect

Academy of Medical Royal Colleges. Academy of Medical Royal Colleges Review 2013-2014. 2014.

n/a Academy of Medical Royal Colleges

n/a Paragraph in the review about differential attainment

AoMRC held a seminar and are co-ordinating work to take initiatives suggested forward.

n/a

Academy of Medical Royal Colleges. Fairness, equality and medical royal college exams: Academy of medical royal colleges statement of principles. 2014.

Royal College examinations

Statement of principles n/a 7 principles identified MRCGP exam is not the reason for differential attainment. Complex and varied factors lead to differential attainment and are not unique to medicine. Colleges must have no factors in their control that contribute to differential attainment

n/a

Al-Taiar Hassan, Menzies Alexandra. Report on the RCPsych IMG Conference 2014. London: Royal College of Psychiatrists 2014.

IMGs taking the MRCPsych

Conference addressing the difficulties encountered by IMGs. Including: induction, training, supervision and feedback.

n/a A number of papers and workshops

Trainers and employers have key roles in supporting IMGs

n/a

BAPIO. Special Edition BAPIO conference. Sushruta. 2014;7(1).

BAPIO Special conference edition of Sushruta following the Judicial Review.

n/a Endorsements from high level policy makers and politicians about BME doctors’ contribution to the NHS

Although not successful, BAPIO describe the judicial review as a ‘moral victory’.

n/a

Bibbo C, Bustamante A, Wang L, et al. Toward a Better Understanding of Gender-Based Performance in the Obstetrics and Gynecology Clerkship: Women Outscore Men on the NBME Subject Examination at One Medical School. Academic Medicine: Journal Of The Association Of American Medical Colleges. 2014.

Retrospective cohort study of students with an Ob/Gyn rotation (2008-2011) at the Icahn School of Medicine at Mount Sinai in New York City.

To better understand why women outperform men in the Ob/Gyn clerkship.

Comparison of female and male students’ performance on MCAT, USMLE and Ob/Gyn clerkship components.

Women who took the MCAT scored lower than men. Similarly, in the USMLE, women scored lower than men. In the Ob/Gyn clerkship, most components showed no significant gender differences. However, women outscored men on the NBME subject examination in Ob/Gyn and so outperformed men in the Ob/Gyn clerkship.

Interest in Ob/Gyn is declining, evidenced by a decrease in U.S. medical school graduates entering residency programs (8% in 1993 to 4% in 2013). In addition, the aging Ob/Gyn workforce has a high level of career dissatisfaction, which leads to early retirement and decreased work hours.

Unclear quality study Generalizability = indirect

British Medical Association. Examining Quality: A survey of royal college examinations. Progress review. Equality and Diversity Committee, 2014.

BMA members BMA Equality and Diversity Committee. Monitoring for protected characteristics by medical schools

Review of the monitoring of speciality examinations by a letter requesting information from the Colleges. All 18 responded.

Continuing variable processes and procedures for monitoring speciality examinations with respect to equality and diversity.

Concerns that insufficient attention is being paid to ensuring these examinations are not affected by unfair discrimination or bias. Further research needed beyond the assessment process.

n/a

Bowhay AR, Watmough SD. An evaluation of the performance in the UK Royal College of Anaesthetists primary examination by UK medical school and gender. BMC Med Educ. 2009;9:38.

UK medical graduates in the postgraduate examinations in anaesthesia, which is the largest hospital based speciality in UK medicine.

The impact that changes to undergraduate curricula might have on postgraduate academic performance.

Data from each sitting of the MCQ section of the primary FRCA examination from June 1999 to May 2008 were analysed for performance by medical school and gender.

4983 attempts at the MCQ part of the FRCA examination by 3303 graduates from the 19 UK medical schools. Graduates from five schools performed significantly better than the mean for the group. Males performed significantly better than females in all aspects of the MCQ. Graduates from three medical schools that have undergone the change from a traditional to a PBL curriculum did not show any change in performance in any aspects of the MCQ pre and post curriculum change.

Graduates from each of the medical schools in the UK show differences in performance in the MCQ section of the primary FRCA, but significant curriculum change does not lead to deterioration in performance. Whilst females now outnumber males taking the MCQ, they are not performing as well as the males.

High quality study Generalizability = direct

Cuddy Monica, Swanson David, Clauser Brian. A Multilevel Analysis of the Relationships between Examinee Gender and United States Medical Licensing Exam (USMLE) Step 2 CK Content Area Performance. Academic Medicine. 2007;82(10):589-93.

23,538 examinees from 136 Liaison Committee on Medical Education–accredited medical schools/ campuses.

To examine the effect of gender on Step 2 CK content area performance, the effect of gender on the relationships between Step 1 scores and Step 2 CK content area performance, and the effect of medical school characteristics on the relationships between examinee characteristics and Step 2 CK content area performance.

Descriptive statistics were computed, and a series of examinees nested-in-schools hierarchical linear models were conducted.

Observed differences indicated that women generally outperformed men in most content areas. School characteristics were generally unrelated to the relationships between examinee characteristics and Step 2 CK content area performance.

While past research indicated that women outperformed men in some content areas, and men outperformed women in others, the current study revealed a somewhat different pattern, with women outperforming men in most content areas.

High quality study Generalizability = indirect

Davis Joe. RCGP instigates wide-ranging review into diversity policies following CSA court case. Pulse. 19 March 2015.

RCGP. Post judicial review

Medical news media RCGP conducting wholesale review into its equality and diversity policies following judicial review

RCGP will be exploring effective ways to collect ‘characteristic data’ on trainees and examiners

n/a n/a

Denney ML, Freeman A, Wakeford R. MRCGP CSA: are the examiners biased, favouring their own by sex, ethnicity, and degree source? Br J Gen Pract 2013;63(616):e718-25.

Data on 4000 candidates (52 000 cases) sitting the MRCGP clinical skills assessment in 2011–2012.

An investigation of candidates’ case performances by candidates’ and examiners’ demographics.

Univariate analyses were undertaken of subgroup performance (male/female, white/black and BME, UK/non-UK graduates) by parallel examiner demographics.

Univariate analysis showed some differences in outcomes between the same-group and other-group examiners: these were contradictory regarding examiners ‘favouring their own’.

Concern exists regarding differential performance of candidates in postgraduate clinical assessments by ethnicity, sex, and country of primary qualification.

High quality study Generalizability = direct

Dewhurst Neil, McManus C, Mollon J, Dacre J, Vale A. Performance in the MRCP(UK) Examination 2003-4: analysis of pass rates of UK graduates in relation to self-declared ethnicity and gender. BMC Medicine. 2007;5(8).

UK medical graduates sitting the MRCP (UK) examination in 2003-4.

Reported underperformance of students from ethnic minorities in undergraduate examinations

Pass rates for each part of the MRCP(UK) Examination in 2003–4 were analysed for differences between graduate groupings based on self-declared ethnicity and gender. All candidates declared their gender, and 84–90% declared their ethnicity.

In all three parts of the examination, white candidates performed better than other ethnic groups (P < 0.001). Analysis of overall average marks showed no interaction between candidate gender and the number of assessments made by female examiners (P = 0.151).

The cause of these differences is most likely to be multifactorial. Potential examiner prejudice, significant only in the cases where there were two non-white examiners and the candidate was non-white, might indicate different cultural interpretations of the judgements being made.

High quality study Generalizability = direct

Duffin Christian. RCGP will ensure examiners are 'representative of race and ethnicity'. PULSE. 21 May 2014.

IMG and BME trainees Medical news media n/a BAPIO and the British International Doctors Association jointly developing initiatives to help IMG and BME trainees

n/a

Esmail A, Roberts C. Independent Review of the Membership of the Royal College of General Practitioners (MRCGP) examination. General Medical Council, 2013.

CSA candidates October 2010 – November 2012

Understanding the difference in pass rates between IMG and BME candidates from UK and white graduates sitting the MRCGP examination

Independent quantitative review. All CSA sittings from October 2010 – November 2012

Significant differences in outcomes between IMG, BME, white and UK graduates in both the AKT and CSA components of the MRCGP.

Candidates need to be provided with better information from the GMC and the Deaneries. Training for educational supervisors and trainers is required. More research needed on factors for success.

High quality study Generalizability = direct

Esmail A, Roberts C. Academic performance of ethnic minority candidates and discrimination in the MRCGP examinations between 2010 and 2012: analysis of data. BMJ (Clinical Research Ed) 2013;347:f5662-f62.

Cohort of 5095 candidates sitting the applied knowledge test and clinical skills assessment components of the MRCGP examination between November 2010 and November 2012.

To determine the difference in failure rates in the postgraduate examination of the Royal College of General Practitioners (MRCGP) by ethnic or national background, and to identify factors associated with pass rates in the clinical skills assessment component of the examination.

Analysis of data provided by the RCGP and the GMC. A further analysis was carried out on 1175 candidates not trained in the United Kingdom who sat the IELTS test and the PLAB examination, controlling for scores on these examinations and relating them to pass rates in the clinical skills assessment.

After controlling for age, sex, and performance in the applied knowledge test, significant differences persisted between white UK graduates and other candidate groups. Black and minority ethnic (BME) graduates, whether trained in the UK or abroad, were more likely to fail the clinical skills assessment than white UK candidates.

Consideration should be given to strengthening postgraduate training for international medical graduates.

High quality study Generalizability = direct

Farrokhi-Khajeh-Pasha Y, Nedjat S, Mohammadi A, Malakan Rad E, Majdzadeh R. Informed choice of entering medical school and academic success in Iranian medical students. Medical Teacher. 2014;36(11):978-82.

220 final-year medical students randomly selected from six Iranian medical schools.

Compared students who made an informed choice about entering medicine with those who did not, in terms of academic success.

Self-administered questionnaire.

Students who had not made an informed choice had a higher tendency not to choose medicine if they were to start over (p value <0.001). The pre-admission scores of students who had made an informed choice of medicine were worse than those of the other group (p=0.03). However, their final year scores as well as their satisfaction with medicine were higher than those of the other group.

Idealistic views of medicine should be replaced by rational and logical ones to help students select the careers best suited to their abilities and talents.

Low quality study Generalizability = indirect

Frei E, Stamm M, Buddeberg-Fischer B. Mentoring programs for medical students - a review of the PubMed literature 2000 - 2008. BMC Medical Education. 2010;10(1):32.

Doctors and undergraduate medical students

Types of structured mentoring programmes that exist for doctors as well as for medical students.

A literature-search strategy was applied to Medline for 1966–2002 using keyword combinations.

A total of 162 publications were identified. 16 (9 for medical students and 7 for doctors) were included for review. The majority of the programmes lack a concrete structure as well as a short- and long-term evaluation. Main goals included increasing professional competence and building up a professional network for the mentees.

Although the results of mentoring are promising, more formal programmes with clear setup goals and a short- and long-term evaluation of the individual successes of the participants as well as the cost-benefit analysis are needed.

n/a

General Medical Council. Interactive reports to investigate factors that affect progression of doctors in training. 2015. http://www.gmc-uk.org/education/25495.asp

For those responsible for managing and delivering education and training in the UK

Initial findings may help to further identify effective mechanisms to support graduates through training pathways

Reports cover one year of exam outcomes and 3 years of round one recruitment data

Reports show exam pass rates between medical school and postgraduate training programmes in addition to some demographics.

n/a

Hawkins Judith. Ethnic minority trainee GPs are suffering racial discrimination, claims Manchester uni study. Mancunian Matters. 28 September 2013.

Refers to Esmail and Roberts report.

Local news media

n/a

Strongly suggests institutional racism

n/a

Hawtin KE, Williams HR, McKnight L, et al. Performance in the FRCR (UK) Part 2B examination: analysis of factors associated with success. Clin Radiol 2014;69(7):750-7.

FRCR 2B candidates 2006-10

To assess factors that influence pass rates and examination scores in the Fellowship of the Royal College of Radiologists (FRCR) 2B examination. This is a high stakes examination. The final component of the FRCR examination, Part 2B (FRCR 2B), involves both oral and written assessments and has remained fundamentally unchanged for more than a decade.

2238 examinations were evaluated between Spring 2006 and Spring 2010. Pass rates and examination scores were analysed by gender, ethnicity, and the influence of factors such as radiology training (UK versus non-UK), sitting (Spring versus Autumn), and the presence of an undergraduate or postgraduate degree.

UK candidates were significantly more likely to pass than non-UK candidates. White candidates were more likely to pass at 1st or 2nd attempt than non-white candidates, but when restricted to UK entrants ethnicity did not influence success at 1st attempt. Overall, females were more successful than males. Having an undergraduate or post-graduate degree did not affect pass rate at first attempt for UK candidates.

The FRCR 2B examination is non-discriminatory for UK candidates with respect to gender and ethnicity. Poorer performance of non-UK trained candidates is a consistent outcome in the literature.

High quality study Generalizability = direct

Huijskens EG, Hooshiaran A, Scherpbier A, van der Horst F. Barriers and facilitating factors in the professional careers of international medical graduates. Med Educ. 2010;44(8):795-804.

32 IMGs who entered the Netherlands as refugees or as spouses of Dutch citizens. As their non-European medical qualifications are not considered equivalent to the Dutch qualifications, they are required to undertake additional medical training.

Social and cultural diversity are increasingly important characteristics of the medical professional workforce. Every year, substantial numbers of IMGs seek jobs outside the countries in which they were educated.

Qualitative research using in-depth interviews

Reported barriers included difficulties in accessing information on and lack of (financial) support. Perseverance was reported to be essential. Lack of command of the Dutch language and age were seen as barriers to securing employment and entrance to specialisation.

Barriers identified have major implications for IMGs wishing to practise medicine in the Netherlands. Better support to overcome the difficulties inherent in migration and career change will result in better trained and acculturated doctors who will be more motivated to contribute to society.

High quality study Generalizability = indirect

Illing J. The experiences of UK, EU and non-EU medical graduates in the transition to the UK workplace. 2009.

Doctors with a PMQ gained outside the UK

Challenges faced by doctors with a PMQ gained outside the UK

Qualitative and quantitative data

Overseas qualified doctors identified differences in practice, structural elements of healthcare and knowledge gaps as well as difficulties outside the workplace.

GMC may play a central role in developing a joined up approach to the support of overseas qualified doctors

High quality study Generalizability = direct

Johannessen K-A, Hagen TP. Individual and hospital-specific factors influencing medical graduates' time to medical specialization. Social Science & Medicine (1982) 2013;97:170-75.

All 2474 Norwegian residents who began specialization in 1999-2001

Gender differences in relation to medical specialization have focused more on social variables than hospital-specific factors.

A multivariate analysis with extended Cox regression, using register data for socio-demographic variables together with hospital-specific variables to study the concurrent effect of these variables on specialty qualification

Multivariate analysis showed that the smaller proportion of women who qualified for a specialty was explained principally by childbirth and by the number of children aged under 18 years.

Hospital factors were significant predictors for the timely attainment of specialization.

High quality study Generalizability = indirect

Jolly P, Boulet J, Garrison G, et al. Participation in U.S. graduate medical education by graduates of international medical schools. Acad Med 2011;86(5):559-64.

IMGs are an important part of U.S. graduate medical education and medical practice. They make up a significant number of the participants in both the Electronic Residency Application Service (ERAS) and the National Resident Matching Program (NRMP).

The multiple pathways used by IMGs in pursuit of a U.S. residency position.

Descriptive study of 10,328 IMGs certified by the ECFMG between July 1, 2005 and June 30, 2006. Linked data study on this cohort determined the numbers of members of the study cohort who participated in ERAS and/or the NRMP in 2003 through 2009, and who found a residency appointment.

The IMGs in the study cohort began applying for residencies the year immediately following ECFMG certification, but almost half were unsuccessful in their first attempts. Three-quarters of the members of the cohort had begun a residency by 2010.

Although they face significant hurdles in achieving their goal, the majority of those who persist are ultimately successful. If enrolments and graduations of U.S. MD- and DO-granting medical schools continue to rise, IMGs’ difficulty in finding residencies is sure to increase.

High quality study Generalizability = indirect

Knight RA. Reasons why doctors who perform well as doctors may fail the MRCGP clinical skills assessment exam. BMJ (Clinical Research Ed) 2013;347:f6438-f38.

IMGs

Questions why IMGs fail an examination in a simulated environment when they can perform in the real environment.

Letter

n/a

Provides eight points that outline possible reasons why ethnicity, training experience, and sex can disadvantage certain candidate groups in a high stakes simulated environment.

n/a

McManus IC, Woolf K, Dacre J. The educational background and qualifications of UK medical students from ethnic minorities. BMC Med Educ 2008;8:21.

White and non-white students entering medical school

UK medical students and doctors from ethnic minorities underperform in undergraduate and postgraduate exams. Research examines the assumption that white and nonwhite students enter medical school with similar qualifications.

Attainment at GCSE and A level, and selection for medical school in relation to ethnicity, were analysed in two separate databases: the 10th cohort of the Youth Cohort Study (GCSEs and A levels) and UCAS data for medical school entry.

NW students have higher educational aspirations, being more likely to go on to take A levels, especially in science and particularly chemistry, despite relatively lower achievement at GCSE. NW medical school entrants have lower A level grades than W entrants, with an effect size of about -0.10.

The effect size for the difference between white and non-white medical school entrants is about -0.10, which would mean that for a typical medical school examination there might be about 5 NW failures for each 4 W failures. However, this effect can only explain a portion of the overall effect size found in undergraduate and postgraduate examinations of about -0.32.

High quality study Generalizability = indirect
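The "5 NW failures for each 4 W failures" figure in the entry above can be roughly reproduced with a short calculation. The sketch below is an illustration only, not the authors' method: it assumes examination scores are normally distributed with equal standard deviations in both groups, that the NW mean sits 0.10 standard deviations below the W mean, and it uses hypothetical W failure rates of 2%, 5% and 10%. The function name `failure_ratio` is invented for this example.

```python
from statistics import NormalDist


def failure_ratio(w_failure_rate: float, effect_size: float) -> float:
    """Ratio of NW to W failure rates under an assumed normal model.

    W scores are standardised (mean 0, SD 1); NW scores are assumed to
    have the same SD and a mean shifted by `effect_size` SDs.
    """
    w_scores = NormalDist(mu=0.0, sigma=1.0)
    # Pass mark (in SD units) that produces the chosen W failure rate.
    pass_mark = w_scores.inv_cdf(w_failure_rate)
    nw_scores = NormalDist(mu=effect_size, sigma=1.0)
    # Proportion of NW candidates scoring below the same pass mark.
    nw_failure_rate = nw_scores.cdf(pass_mark)
    return nw_failure_rate / w_failure_rate


# Hypothetical W failure rates; the effect size of -0.10 is taken from the entry above.
for rate in (0.02, 0.05, 0.10):
    ratio = failure_ratio(rate, -0.10)
    print(f"W failure rate {rate:.0%}: NW/W failure ratio = {ratio:.2f}")
```

Under these assumptions the ratio falls roughly between 1.2 and 1.3 for typical failure rates, which is in line with 5:4 (1.25). The exact value depends on where the pass mark sits, so the 5:4 figure is best read as an approximation.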

McManus I C, Wakeford R. PLAB and UK graduates' performance on MRCP(UK) and MRCGP examinations: data linkage study. BMJ. 2014;348:g2612.

7829, 5135, and 4387 PLAB graduates on their first attempt at MRCP(UK) Part 1, Part 2, and PACES assessments from 2001 to 2012 compared with 18 532, 14 094, and 14 376 UK graduates taking the same assessments; 3160 PLAB1 graduates making their first attempt at the MRCGP AKT during 2007-12 compared with 14 235 UK graduates; and 1411 PLAB2 graduates making their first attempt at the MRCGP CSA during 2010-12 compared with 6935 UK graduates.

To assess whether IMGs passing PLAB1 and PLAB2 are equivalent to UK graduates at the end of the first foundation year of medical training (F1), as the GMC requires, and if not, to assess what changes in the PLAB pass marks might produce equivalence.

Data linkage of GMC PLAB performance data with data from the Royal Colleges of Physicians and the Royal College of General Practitioners on performance of PLAB graduates and UK graduates at the MRCP(UK) and MRCGP examinations.

PLAB1 marks were a valid predictor of MRCP(UK) Part 1, MRCP(UK) Part 2, and MRCGP AKT (r=0.521, 0.390, and 0.490, respectively). PLAB graduates had significantly lower scores on MRCP(UK) and MRCGP assessments and were more likely to fail assessments and to progress more slowly than UK medical graduates. IELTS scores correlated significantly with later performance, with multiple regression showing that the effect of PLAB1 (β=0.496) was much stronger than the effect of IELTS (β=0.086).

PLAB is a valid assessment of medical knowledge and clinical skills, correlating well with performance at MRCP(UK) and MRCGP. PLAB graduates’ knowledge and skills at MRCP(UK) and MRCGP are over one standard deviation below those of UK graduates. To produce equivalent performance on the MRCP(UK) and MRCGP examinations, the pass mark for PLAB1 would require raising by about 27 marks (13%) and for PLAB2 by about 15-16 marks (20%).

High quality study Generalizability = direct

McManus C, Elder A, Dacre J. Investigating possible ethnicity and sex bias in clinical examiners: An analysis of data from the MRCP(UK) PACES and nPACES examinations. BMC Medical Education. 2013;13.

Candidates at all examination centres for the first 26 diets of PACES, the original form of the examination, held from 2001/1 to 2009/2, and for the next six diets of nPACES, diets 27–32, held from 2009/3 to 2011/2.

Bias of clinical examiners against some types of candidate, based on characteristics such as sex or ethnicity, would represent a threat to the validity of an examination, since sex or ethnicity are ‘construct-irrelevant’ characteristics.

A statistical analysis comparing each examiner against a ‘basket’ of all of their co-examiners to identify examiners whose behaviour is anomalous. The results of 26 diets of PACES and six diets of nPACES were examined statistically to assess the extent of hawkishness, as well as sex bias and ethnicity bias, in individual examiners.

The method works well when there is more than one examiner at a station and in the case of the current MRCP(UK) clinical examination, nPACES, found possible sex bias in no examiners and possible ethnic bias in only one.

In examinations where there are two independent examiners at a station, our method can assess the extent of bias against candidates with particular characteristics. The method would be far less sensitive in exams with only a single examiner per station as examiner variance would be confounded with candidate performance variance.

High quality study Generalizability = direct

McManus I C, Woolf K, Dacre J, Paice E, Dewberry C. The Academic Backbone: longitudinal continuities in educational achievement from secondary school and medical school to MRCP(UK) and the specialist register in UK medical students and doctors. BMC Medicine. 2013;11(242).

The 1980, 1985, and 1990 cohort studies (entered medical school in 1981, 1986, and 1991), and the UCLMS Cohort Study (entered clinical studies in 2005 and 2006).

Selection of medical students in the UK.

This study analyses data from five longitudinal studies of UK medical students and doctors from 1970s - 2000s. Sex and ethnic differences were also analysed in light of the changing demographics of medical students over the past decades.

There were robust correlations across different years at medical school, and medical school performance also predicted MRCP(UK) performance and being on the GMC Specialist Register. A-levels correlated somewhat less with undergraduate and post-graduate performance.

The existence of the Academic Backbone concept is strongly supported, with attainment at secondary school predicting performance in undergraduate and post-graduate medical assessments, and the effects spanning many years.

High quality study Generalizability = direct

Alex Matthews-King. RCGP and BAPIO collaborate to address pass rate discrepancies. PULSE. 20 June 2014.

IMG and BME students

Medical news media

n/a

BAPIO and the British International Doctors Association jointly developing initiatives to help IMG and BME trainees

n/a

Memon M, Joughin G, Memon B. Oral assessment and postgraduate medical examinations: establishing conditions for validity, reliability and fairness. Advances in Health Sciences Education. 2010;15(2):277-89.

Postgraduate students

Review to examine the practice of oral assessment in postgraduate medical education in the context of the core assessment constructs of validity, reliability and fairness.

n/a

Highlights the complexity of oral assessment as an examination format, and raises concerns about the validity, reliability and fairness of such an assessment procedure for the award of certification of completion of specialist training.

Calls for high quality published research to allay concerns about the transparency and fairness of these examinations, especially when assessing IMGs. The article concludes by proposing 15 conditions under which oral assessment is valid, reliable and fair.

n/a

Millett David. Exclusive: BAPIO hails watershed year for BME doctors. GPonline. 27 November 2014.

IMG and BME students

Press release ahead of the 2014 BAPIO conference

n/a

n/a

BAPIO describe 2014 as a ‘watershed year’ in which the judicial review ‘woke up the establishment’.

n/a

Nash Sally. RCGP exam results reveal narrowing gaps between UK and overseas graduates. PULSE 23 January 2015.

Candidates taking the MRCGP

Medical news media

n/a

Narrowing gap between white UK and other graduates taking the examination as a result of ‘a number of critical interventions’.

n/a

Patterson Fiona, Denney Mei-Ling, Wakeford R, Good D. Fair and equal assessment in postgraduate training? British Journal of General Practice. 2011:712-3.

Responds to Woolf’s meta analysis (2011)

Ethnic differences in attainment are a consistent feature of medical education in the UK. The most substantial differences are seen among doctors taking postgraduate examinations as IMGs.

Editorial.

New research methodologies could provide original insights. Four key areas to guide further research are presented, ranging from design issues to analysing outcomes in practice.

n/a

Patterson Fiona, La-Band Analise, Koczwara Anna, Spicer John. GP National Selection Process: Equalities Impact. 2012.

Candidates for GP selection

Commissioned equalities impact project

Multi-stage project comprising a desk review and data collection. Focusing on Equality and Diversity issues in relation to selection.

Consistent findings in GP equal opportunities data over time, show that the largest group differences in performance in the national selection process relate to place of medical qualification, with UK trained candidates significantly outperforming others.

Appropriate and realistic efforts need to be made to understand and reduce group differences in performance. Additional analysis of national selection data required. Annual equalities impact monitoring required. Facilitation of qualitative research

High quality study Generalizability = direct

Plint S, Patterson F. Identifying critical success factors for designing selection processes into postgraduate specialty training: the case of UK general practice. Postgrad Med J 2010;86(1016):323-7.

The general practitioner recruitment process introduced machine markable short listing assessments for the first time in the UK postgraduate recruitment context, and also adopted selection centre workplace simulations.

Relatively little research on developing selection methodology for entry to postgraduate training.

Describes the history of the development of the GP recruitment for postgraduates

The recruitment process is a robust national process which has high reliability and predictive validity, and is perceived to be fair by candidates and allocates applicants equitably across the country.

The key success factors have been identified as corporate commitment to the goal of a national process, with gradual convergence maintaining locus of control rather than the imposition of change without perceived legitimate authority.

n/a

Rimmer A. Royal colleges must improve data on diversity of exam candidates BMA says. BMJ Careers. 23 January 2014.

The ways in which Royal colleges gather information on protected characteristics of candidates taking exams

Medical news media

n/a

Highly varied development of equality and diversity training across colleges. Call by the BMA for the colleges to match public sector requirements (section 149 of the Equality Act 2010).

n/a

Rimmer A. RCP is to highlight gap in performance between overseas doctors and UK graduates. BMJ Careers. 02 December 2014.

Report on RCP presentation at the BAPIO conference

Medical news media

n/a

RCP looking to train and include lay people in assessment. Issues faced by IMGs are not unique to the UK.

n/a

Rimmer A. BME doctors less likely to be offered postgraduate training posts than white doctors, says GMC. BMJ Careers. 12 March 2015.

RCP raising awareness among its examiners about differential attainment between UK and IMGs

Medical news media

n/a

RCP looking to train and include lay people in assessment. Issues faced by IMGs are not unique to the UK.

n/a

Roberts Celia, Atkins Sarah, Hawthorne Kamila. Performance features in clinical skills assessment: Linguistic and cultural factors in the Membership of the Royal College of General Practitioners examination. London: Centre for Language, Discourse & Communication, Kings College London, 2014.

IMGs and BME UK-trained graduates. For the purpose of this study all graduates who trained abroad both from the EU and elsewhere are included in the category IMG.

To investigate the extent to which linguistic and cultural factors contribute to poor performance. To raise awareness among examiners, GP trainers and candidates of the linguistic and cultural demands of the CSA exam.

Quantitative and qualitative sociolinguistic methods supported by ethnographic information. Videoed 198 candidates over 2 exam diets and a detailed analysis of 40 cases and reviewed CSA paperwork

Decontextualised nature of the CSA makes it ‘talk heavy’, requiring communicative fluency. Communicative performance factors contribute to the gap in success rates. Higher rates of misunderstanding with role play patients. Multilingual expertise of IMGs not assessed.

The need for focused training, support and preparation for specific areas of the exam

High quality study Generalizability = direct

Rushd S, Landau AB, Khan JA, et al. An analysis of the performance of UK medical graduates in the MRCOG Part 1 and Part 2 written examinations. Postgrad Med J 2012;88(1039):249-54.

1335 doctors graduating in UK medical schools who entered the Part 1 MRCOG and 822 doctors taking the Part 2 MRCOG written examination for the first time between 1998 and 2008.

Evaluate the variations in performance of UK medical graduates in the MRCOG examination.

Comparison of graduate performance of UK medical schools in the two parts of the MRCOG examinations. The main outcome measures were to evaluate medical school effects, gender effects and academic performance effect.

Graduates of UK medical schools performed differently in the Part 1 and Part 2 written MRCOG examinations. There was no gender difference in the success rates of candidates in Part 1; however, female candidates had a significantly better success rate in the Part 2 written examination than male candidates.

There is variation in performance but a lack of evidence on whether graduates from different medical schools perform differently in postgraduate examinations.

High quality study Generalizability = direct

Rushd S, Landau AB, Lindow SW. An evaluation of the first time performance of international medical graduates in the MRCOG Part 1 and Part 2 written examinations. European Journal of Obstetrics and Gynaecology and Reproductive Biology 2013;166:124-126.

11,863 candidates who appeared for the first time in Part 1 and 5336 in Part 2 (2000-2010)

To evaluate the performance of IMGs in the MRCOG Part 1 and Part 2 written examinations. Candidates were grouped according to geographical bands.

Retrospective analysis using RCObs/Gyne database.

Candidates from different bands performed differently

Performance in the Part 1 and Part 2 written MRCOG examinations varies among IMGs from different geographical regions.

High quality study Generalizability = direct

Sandhu DP. Current dilemmas in overseas doctors' training. Postgrad Med J 2005;81(952):79-82.

IMGs

Changes to DoH, Home Office, and deanery regulations with expansion of medical schools, implementation of European Working Time Directive, Modernising Medical Careers, and the future role of the Postgraduate Medical Education and Training Board, will have an important impact on IMGs’ training.

Opinion piece

Their very success and media publicity about general practice and consultant shortages, has led to a large influx of inexperienced doctors seeking training opportunities in competitive specialties. Dissemination of realistic information about postgraduate training opportunities is important as the NHS for some time will continue to rely on IMGs.

IMGs are a remarkably successful professional group in the United Kingdom making up to 30% of the NHS work force. In 2003 a record 15 549 doctors joined the medical register of which 9336 doctors were non-European Economic Area citizens.

n/a

Schrewe B, Frost H. Finding potential in balance: navigating the competing discourses of diversity and standardization. Academic Medicine: Journal Of The Association Of American Medical Colleges 2012;87(11):1479-79.

Examines the place of the individual in the context of the profession.

Response to earlier paper on diversity. Discussion of the tension between the discourse of diversity and the discourse of standardization

n/a

Asks which common qualities make us physicians, and to what extent individual variation around these qualities can be supported before the very essence of the profession begins to dissipate.

n/a

Shaw Q. High failure rate of ethnic minority groups in MRCGP exam comes from changes to exam and candidate selection. BMJ (Clinical Research Ed) 2013;347:f6442-f42.

References Esmail A, BMJ 2013;347:f5662. (26 September.)

Letter

n/a

Unintended institutional racism resulting from changes to the RCGP combined with a disincentivization for deaneries to recruit more students.

Data needs to be re-analysed to expose the importance of language skills.

n/a

Smits PB, Verbeek JH, Nauta MC, et al. Factors predictive of successful learning in postgraduate medical education. Med Educ 2004;38(7):758-66.

A follow-up study of 118 doctors on a post-graduate occupational health training programme on the management of mental health problems.

Gender difference in learning

The following personal and contextual variables were measured as potential predictors of outcome: gender; age; years of experience as a doctor; university of graduation; learning style (Kolb); present employer (occupational health service), and educational format (problem-based or lecture-based).

After multivariate analysis female gender was positively related to accruements in both knowledge and performance independently of the influence of other factors. Accommodator learning style showed a relation with knowledge increase but had no influence on performance. The PBL format yielded a better performance outcome but had no influence on knowledge tests.

Gender and learning style found to be related to an increase in knowledge. Gender was also found to be related to improvement in performance after a postgraduate medical education programme.

High quality study Generalizability = indirect

Stamm M, Buddeberg-Fischer B. The impact of mentoring during postgraduate training on doctors’ career success. Medical Education. 2011;45(5):488-96.

326 doctors (172 women, 52.8%; 154 men, 47.2%) from a cohort of medical school graduates participating in the prospective SwissMedCareer Study.

SwissMedCareer Study, assessing personal characteristics, the possession of a mentor, mentoring support provided by the development network, and career success.

Study made use of a longitudinal design to investigate the impact of mentoring during postgraduate specialist training on the career success of doctors.

This study confirmed the positive impact of mentoring on career success in a cohort of Swiss doctors in a longitudinal design. However, female doctors, who are mentored less frequently than male doctors, appear to be disadvantaged in this respect.

Formal mentoring programmes could reduce barriers to mentorship and promote the career advancement of female doctors in particular.

High quality study Generalizability = indirect

Thomas B, Manusov EG, Wang A, et al. Contributors of black men's success in admission to and graduation from medical school. Academic Medicine: Journal Of The Association Of American Medical Colleges 2011;86(7):892-900.

In 2010, one of the authors, a black man, interviewed 10 black male medical students enrolled at Florida State University College of Medicine and 3 black male physicians associated with that school, using consensual qualitative research methodology to analyse the data.

Increasing the number of black physicians in medicine is a goal that continues to receive attention from researchers and medical schools. Currently, black Americans constitute approximately 13% of the U.S. population. They account, however, for only 4% of the U.S. physician workforce.

Qualitative research using in-depth interviews to determine characteristics and individual experiences that contribute to black men’s success in being admitted to and graduating from medical school.

The authors identified six broad contributors to successful admission and completion of medical school: social support, education, exposure to the field of medicine, group identity, faith, and social responsibility.

Success for black men is achieved via a balance between educational experiences, psychosocial-cultural experiences, and personal attributes and individual perceptions. This information can be used by medical schools to strengthen their outreach programs, provide a theoretical construct for discussion and research, and generate questions for future quantitative studies.

Low quality study Generalizability = indirect

Tiffin PA, Illing J, Kasim AS, et al. Annual Review of Competence Progression (ARCP) performance of doctors who passed Professional and Linguistic Assessments Board (PLAB) tests compared with UK medical graduates: National data linkage study. BMJ: British Medical Journal 2014;348.

53 436 UK based trainee doctors with at least one competency related ARCP outcome reported during the study period, of whom 42 017 were UK medical graduates and 11 419 were international medical graduates who were registered following a pass from the PLAB route

To determine whether use of the PLAB examination system to grant registration for international medical graduates results in equivalent postgraduate medical performance, as evaluated at the Annual Review of Competence Progression (ARCP), between UK based doctors who qualified overseas and those who obtained their primary medical qualification from UK universities.

Observational study linking ARCP outcome data from the UK deaneries with PLAB test performance and demographic data held by the GMC.

International medical graduates were more likely to obtain a less satisfactory outcome at ARCP compared with UK graduates.

The PLAB test used for registration of IMGs is not generally equivalent to the requirements for UK graduates. This may be addressed by raising the standards of English language competency required as well as the pass marks for the two parts of the PLAB test. An alternative might be to introduce a different testing system.

High quality study Generalizability = direct

Tolan AM, Kaji AH, Quach C, et al. The electronic residency application service application can predict accreditation council for graduate medical education competency-based surgical resident performance. J Surg Educ 2010;67(6):444-8.

A total of 77 residents from two (one university and one community based university-affiliate) general surgery residency programs were included in the analysis.

Electronic Residency Application Service (ERAS)

Retrospective correlation of data points found in the ERAS application with core competency-based clinical rotation evaluations. The overall competency score was defined as an average of all 6 competencies and technical skills.

USMLE scores were only predictive of Medical Knowledge. Multivariable analysis showed honors in Ob/Gyn, female gender, older age, and total number of honors to be predictive of a number of individual core competencies.

The ERAS application is useful for predicting subsequent competency based performance in surgical residents. Receiving honors in the surgery clerkship, which has traditionally carried weight when evaluating a potential surgery resident, may not be as strong a predictor of future success.

High quality study Generalizability = indirect

Vaughan S, Sanders T, Crossley N, et al. Bridging the gap: the roles of social capital and ethnicity in medical student achievement. Medical Education 2015;49(1):114-23.

Participants were sampled across the four hospital placement sites; a total of 158 medical students in their clinical phase (Years 3 and 4) completed the survey.

An identified discrepancy between the achievement level of White students and that of their ethnic minority peers. The processes underlying this disparity have not been adequately investigated or explained.

Data from a cross-sectional social network study conducted in one UK medical school are presented and are analysed alongside examination records obtained from the medical school. Study utilises social network analysis to investigate the impact of relationships on medical student achievement by ethnicity, specifically by examining homophily.

Although significant patterns of ethnic and religious homophily emerged, no link was found between these factors and achievement. Lower levels of the social capital that mediates interaction with peers, tutors and clinicians may be the cause of underperformance by ethnic minority students.

Because of ethnic homophily, minority students may be cut off from potential and actual resources that facilitate learning and achievement.

High quality study Generalizability = indirect

Wakeford R, Denney M, Ludka-Stempien K, Dacre J, McManus I C. Cross-comparison of MRCGP & MRCP(UK) in a database linkage study of 2,284 candidates taking both examinations: assessment of validity and differential performance by ethnicity. BMC Medical Education. 2015;15:1.

2,284 candidates who had taken one or more parts of both assessments, MRCP(UK) typically being taken 3.7 years before MRCGP.

In the UK, underperformance of ethnic minority doctors taking MRCGP has had a high political profile. Substantial performance differences between white and BME doctors undoubtedly exist. Understanding ethnic differences can be helped by comparing the performance of doctors who take both MRCGP and MRCP(UK).

Analysis of performance on knowledge-based MCQs (MRCP(UK) Parts 1 and 2 and MRCGP Applied Knowledge Test (AKT)) and clinical examinations (MRCGP Clinical Skills Assessment (CSA) and MRCP(UK) Practical Assessment of Clinical Skills (PACES)).

Correlations between MRCGP and MRCP(UK) were high. BME candidates performed less well on all five assessments (P < .001). Correlations disaggregated by ethnicity were complex. CSA changed its scoring method during the study; multiple regression showed the newer CSA was better predicted by PACES than the previous CSA.

High correlations between MRCGP and MRCP(UK) support the validity of each, suggesting they assess knowledge cognate to both assessments. Whilst the reason for the differential performance is unclear, the similarity of the effects in independent knowledge and clinical examinations suggests the differences are unlikely to result from specific features of either assessment and most likely represent true differences in ability.

High quality study Generalizability = direct

Watmough S, Bowhay A. An evaluation of the impact of country of primary medical qualification on performance in the UK Royal College of Anaesthetists' examinations. Medical Teacher. 2011;33:938-40.

This study summarises the performance of graduates by country of primary medical qualification in part one of the UK RCA examination from 1999 to 2008.

From June 1999 to February 2008, there were 9315 attempts at the MCQ by 5797 graduates from 70 countries, with 25 countries having candidates who made 15 or more attempts.

Data were collated from RCA spreadsheets for each attempt of the primary examination from June 1999 to May 2008 from the main RCA trainee database. Candidates were ranked into groups according to the country of PMQ for the overall final percentage mark of the MCQ section of the Primary Fellowship of the Royal College of Anaesthetists examination.

Candidates from Australia, New Zealand, South Africa, Zimbabwe and the UK performed significantly better than the mean for the group and candidates from Egypt, Iraq, Ireland and Pakistan performed significantly worse.

Some graduates who sit UK postgraduate exams may require additional support prior to taking these examinations.

High quality study Generalizability = direct

Woloschuk W, McLaughlin K, Wright B. Is undergraduate performance predictive of postgraduate performance? Teach Learn Med 2010;22(3):202-4.

Medical school graduates (Classes 2004–2006) at the end of the 1st postgraduate year.

To determine whether undergraduate performance is predictive of postgraduate performance.

Residency program directors assessed the performance of medical school graduates (Classes 2004–2006) at the end of the 1st postgraduate year.

Correlations between undergraduate and the two postgraduate measures were low (.03–.31).

Measures of undergraduate performance appear to be poor predictors of performance in residency that consisted of two primary dimensions (clinical acumen and human sensitivity).

High quality study Generalizability = indirect

Woolf K, Cave J, Greenhalgh T, Dacre J. Ethnic stereotypes and the underachievement of UK medical students from ethnic minorities: qualitative study. 2008.

27 year 3 medical students and 25 clinical teachers, purposively sampled for ethnicity and sex. A London medical school.

To explore ethnic stereotypes of UK medical students in the context of academic underachievement of medical students from ethnic minorities.

Qualitative study using semi-structured one-to-one interviews and focus groups. Data were analysed using the theory of stereotype threat.

The existence of a negative stereotype about their group also raises the possibility that underperformance of medical students from ethnic minorities may be partly due to stereotype threat.

Asian clinical medical students may be more likely than white students to be perceived stereotypically and negatively, which may reduce their learning by jeopardising their relationships with teachers.

High quality study Generalizability = indirect

Woolf K, Potts HWW, McManus IC. Ethnicity and academic performance in UK trained doctors and medical students: systematic review and meta-analysis. BMJ (Clinical Research Ed) 2011;342:d901-d01.

Medical students and doctors from different ethnic groups were included.

To determine whether the ethnicity of UK trained doctors and medical students is related to their academic performance.

Systematic review and meta-analysis. The study included quantitative reports that measured the performance of medical students or UK trained doctors from different ethnic groups in undergraduate or postgraduate assessments.

Ethnic differences in academic performance are widespread across different medical schools, different types of exam, and in undergraduates and postgraduates. They have persisted for many years and cannot be dismissed as atypical or local problems.

More detailed information to track the problem as well as further research into its causes is required. Such actions are necessary to ensure a fair and just method of training and of assessing current and future doctors.

High quality study Generalizability = indirect

Woolf K, McManus IC, Potts HWW, et al. The mediators of minority ethnic underperformance in final medical school examinations. The British Journal Of Educational Psychology 2013;83(Pt 1):135-59.

Two consecutive cohorts of Year 5 (final year) UCL Medical School students (n = 703; 51% minority ethnic). A total of 587 (83%) had previously completed a questionnaire in Year 3. Participants were then followed up to final year (2007–2010).

To investigate whether demographic and psychological factors mediate the relationship between ethnicity and final examination scores.

Participants were administered a questionnaire that included a short version of the NEO-PI-R, the Study Process Questionnaire, and the General Health Questionnaire (GHQ) as well as socio-demographic measures. Questionnaire responses and final examination grades were compared using univariate tests. The effect of ethnicity on final year grades after taking into account the questionnaire variables was calculated using hierarchical multiple linear regression.

Ethnic differences in the final year performance of two cohorts of UCL medical students were not due to differences in psychological or demographic factors, which suggests alternative explanations are responsible for the ethnic attainment gap in medicine.

UK-trained medical students and doctors from minority ethnic groups underperform academically. It is unclear why this problem exists, which makes it difficult to know how to address it.

High quality study

Action needed to end college exam disparity. BMA. 11 April 2014.

Comment on the outcome of the judicial review

Medical news media n/a

GPs seek exam help for international doctors. BMA. 14 Jan 2013.

GP trainers Medical news media n/a CSA identified by GP trainers as contributing to low pass rates for IMGs

Specifically tailored training required. Suggests that all CSA exams are videoed

Low ethnic minority exam pass rates sparks call for research. BMA. 24 June 2013.

BMA annual executive meeting calls for Royal colleges to publish analysis of their exam results

Medical news media n/a

Differing pass rates raise concerns about MRCGP exam. BMA. 24 May 2013.

Concerns raised by GPs over the validity of the MRCGP given the disparity in pass rates between UK and IMG candidates

Medical news media n/a IMGs identified as not a homogenous group

GMC report highlights exam disparity. BMA. 17 March 2015.

Doctors in training Medical news media n/a Relates to the GMC's publication of interactive reports to investigate factors that affect progression of doctors in training.

Appendix 2: Quality evaluation of studies using primary data

Was there a clear statement of the aims of the research? Yes / No / Unclear

Was the research design appropriate to address the aims of the research? Yes / No / Unclear

Were there any issues relating to the selection of the measurements and categories in the research project? High / Low / Unclear

Was the recruitment strategy appropriate to the aims of the research? High / Low / Unclear

Was the data collected in a way that addressed the research issue? High / Low / Unclear

Have ethical issues been taken into consideration? High / Low / Unclear

Has the relationship between the researcher and participant been adequately considered? High / Low / Unclear

Was the data analysis sufficiently rigorous? High / Low / Unclear

Is there a clear and thorough statement of findings? High / Low / Unclear

How valuable is the research? High / Low / Unclear

Generalizability Direct / Unclear / Indirect

Abdulghani (2014): Yes Unclear High High High High Unclear High High High Indirect
Bibbo (2014): Yes Yes Low High High High High High High Unclear Indirect
Bowhay (2009): Yes Yes High High High High High High High High Direct
Cuddy (2007): Yes Yes High High High High High High High High Indirect
Denney (2013): Yes Yes High High High High High High High High Direct
Dewhurst (2007): Yes Yes High High High High High High High High Direct
Esmail (2013)a: Yes Yes High High Unclear High High High High High Direct
Esmail (2013)b: Yes Yes High High Unclear High High High High High Direct
Farrokhi-Khajeh-Pasha (2014): Yes Unclear High High Unclear High Unclear Unclear Unclear High Indirect
Hawtin (2014): Yes Yes High High High High High High High High Direct
Huijskens (2010): Yes Yes High High High High Unclear High High High Indirect
Illing (2009): Yes Yes High High High High High High High High Direct
Johannessen (2013): Yes Yes High High High High High High High High Indirect
Jolly (2011): Yes Yes High High High High High High High High Indirect
McManus (2008): Yes Yes High High High High High High High High Indirect

McManus (2013): Yes Yes High High High High High High High High Direct
McManus (2014): Yes Yes High High High High High High High High Direct
Patterson (2012): Yes Unclear High Unclear Unclear Unclear Unclear Unclear High Unclear Direct
Roberts (2014): Yes Yes High High High High High High High High Direct
Rushd (2012): Yes Yes High High Unclear High High High High High Direct
Rushd (2013): Yes Yes High High High High High High High High Direct
Smits (2004): Yes Yes High High High High High High High High Indirect
Stamm (2011): Yes Yes High High High High High High High High Indirect
Thomas (2011): Yes Yes High High Low Unclear Low High High High Indirect
Tiffin (2014): Yes Yes High High High High High High High High Direct
Tolan (2010): Yes Yes High High High High High High High High Indirect
Vaughan (2015): Yes Yes High High High High High High High High Indirect
Wakeford (2015): Yes Yes High High High High High High High High Direct
Watmough (2011): Yes Yes High High High High High High High High Direct
Woloschuk (2010): Yes Yes Unclear High High High Low High High High Indirect
Woolf (2011): Yes Yes High High High High High High High High Indirect
Woolf (2008): Yes Yes High High High High High High High High Indirect