Measurement Properties of the Whiplash Disability

Questionnaire in Acute Whiplash-associated Disorders

by

Maja Stupar

A thesis submitted in conformity with the requirements for the degree of Doctor of Philosophy

Institute of Health Policy, Management and Evaluation University of Toronto

© Copyright by Maja Stupar, 2013

Measurement Properties of the Whiplash Disability Questionnaire

in acute Whiplash-associated Disorders

Maja Stupar

Doctor of Philosophy (Clinical Epidemiology)

Institute of Health Policy, Management and Evaluation University of Toronto

2013

Abstract

Whiplash-associated disorders (WAD) include physical and psychological symptoms that may

lead to disability. However, measuring disability following whiplash injuries is challenging

because we lack valid and reliable measurement tools. The assessment of WAD-related

disability relies on self-reported instruments that are specific to neck pain and do not

comprehensively target the constructs associated with WAD-related disability. Designing new

tools and evaluating their measurement properties is challenging because of the apparent

inconsistencies in the theoretical frameworks (psychometrics and clinimetrics) used in

instrument development and in reliability, validity and responsiveness evaluation.

A scoping review design was used to develop a conceptual theory on the difference between

clinimetrics and psychometrics in order to provide recommendations for future application. The

scoping review of psychometric and clinimetric methods suggested that the two frameworks are

not as divergent as reflected in the current protracted debates. Content analysis revealed that

differences only exist in the scope of what is measured and in instrument development methods

with no operational differences in the testing phases. Based on content analysis, I developed a

new framework that bridges the two measurement schools with an overlapping informed zone

between them.

I designed a cohort study of 130 participants with acute WAD to assess the measurement

properties of the Whiplash Disability Questionnaire (WDQ). The WDQ is a recently developed

instrument designed to capture the broad construct of WAD-related disability. The WDQ

measurement properties were determined in adults with WAD recruited within 21 days of their

collision. My study indicates that the WDQ and its subscales are reliable and valid for clinical

and research use. The WDQ can demonstrate change over time as a single scale or as the daily

activities subscale. However, WDQ users should be aware of its measurement error when

demonstrating change over time. Furthermore, the emotional subscale should not be used alone

to demonstrate change over six weeks because it was not responsive.

My thesis proposes a unified framework for studying the measurement properties of assessment

tools used in clinical practice. I also demonstrated that the WDQ possesses the necessary

properties to be used in patients with acute WAD.

Acknowledgments

The journey of life is not a journey taken alone. I would like to thank everyone who has

contributed to my journey through this doctoral program.

First, I would like to thank my supervisor, Dr. Pierre Côté, for his mentorship, support and

availability throughout my doctoral journey. I thank him for challenging me and helping me

mold into the young investigator that I have become. His infectious enthusiasm for research and

excellence in scientific rigor continue to be inspiring. I was also privileged to work with a

dedicated and supportive advisory committee, Dr. Dorcas Beaton, Dr. Eleanor Boyle and Dr. J.

David Cassidy. I am thankful for their guidance and tireless feedback.

Several individuals and teams contributed support directly and indirectly in the completion of my

thesis. I would like to thank everyone who contributed to the recruitment, data collection,

processing and completion of the UHN Whiplash Intervention Trial that, in turn, helped the

completion of this thesis project.

Without funding support, the completion of my doctoral program would not be possible. I would

like to thank the Canadian Institutes of Health Research (CIHR) for providing three years of

financial support toward my doctoral studies through the Vanier Canada Graduate Scholarship.

My doctoral education experience was also enriched with the opportunity to study abroad at

Karolinska Institute in Stockholm, Sweden through the support of the CIHR Michael Smith

Foreign Study Supplement. I am thankful to the Department of Clinical Epidemiology and

Health Care Research within the Institute of Health Policy, Management and Evaluation at the

University of Toronto for providing additional support. Finally, without AVIVA Canada’s

vision of investing in research to improve business practices, the UHN Whiplash Intervention

Trial would not be possible and, in turn, my thesis projects would not have been completed. I

thank all these institutions for making it possible for me to dedicate time to my doctoral

education.

I thank my personal friends for their smiles and laughter that made my journey that much more

enjoyable and for their support during those more challenging times.

I am thankful to my family for their unwavering support; to my wonderful parents Milica and

Ilija Stupar, for guiding me through the rollercoaster of life and teaching me the value of hard

work; and to my dear sister Biljana for always standing by me with an attentive ear and for all

her insightful advice.

Table of Contents

Abstract
Acknowledgments
Table of Contents
List of Tables
List of Figures
List of Appendices
List of abbreviations
Preface
Chapter 1: Introduction
1.1 Measuring disability in health research
1.2 Epidemiology of Whiplash-associated Disorders
1.2.1 Definition
1.2.2 The burden of whiplash-associated disorders in the population
1.2.3 Prognosis of Whiplash-associated Disorders
1.2.4 Treatment of Whiplash-associated Disorders
1.2.5 Outcome measures currently used in WAD research
1.3 The measurement divide
1.4 Objectives
1.4.1 General Objectives
1.4.2 Specific Objectives
1.5 Structure of the Thesis
Chapter 2: Measurement Properties: A new framework to contribute to the debate between the field of clinimetrics and psychometrics
2.1 Introduction
2.2 Methods
2.2.1 Research question
2.2.2 Search for relevant studies
2.2.3 Study selection
2.2.4 Data charting
2.2.5 Collation, summarizing and reporting results including synthesis
2.3 Results
2.3.1 Literature search
2.3.2 Study selection
2.3.3 Data charting
2.3.4 Collation, summarizing and reporting of results
2.3.5 Synthesis
2.4 Discussion
2.5 Conclusion
Chapter 3: Can Recovery from Whiplash-associated Disorders be Measured Reliably in Patients with Acute Whiplash-Associated Disorders? A Test-retest Reliability Study of the Whiplash Disability Questionnaire
3.1 Introduction
3.2 Methods
3.2.1 Participants
3.2.2 Procedure
3.2.3 Data
3.2.4 Sample Size
3.2.5 Analysis
3.2.5.1 Test-Retest Reliability
3.2.5.2 Minimal detectable change
3.2.5.3 Sensitivity Analyses
3.3 Results
3.3.1 Descriptive statistics
3.3.2 Completeness of WDQ
3.3.3 Test-retest reliability
3.3.4 Individual item test-retest reliability
3.3.5 Minimal detectable change
3.4 Discussion
3.5 Conclusion
3.6 Acknowledgement
Chapter 4: Exploratory Factor Analysis, Validity and Responsiveness of the Whiplash Disability Questionnaire in Adults with Acute Whiplash-associated Disorders
4.1 Introduction
4.2 Methods
4.2.1 Participants and Procedures
4.2.2 Data Collection
4.2.2.1 Whiplash Disability Questionnaire
4.2.2.2 Numerical Pain Rating Scale
4.2.2.3 Neck Disability Index
4.2.2.4 Neck Bournemouth Questionnaire
4.2.2.5 CES-D
4.2.2.6 SF-36 Health Survey
4.2.2.7 Self-report Recovery
4.2.3 Analysis
4.2.3.1 Descriptive statistics
4.2.3.2 Factor Structure
4.2.3.3 Validity
4.2.3.4 Responsiveness
4.2.4 Sample Size
4.3 Results
4.3.1 Sample characteristics
4.3.2 Data completion
4.3.3 Factor structure
4.3.4 Validity
4.3.5 Responsiveness
4.4 Discussion
4.5 Conclusion
4.6 Acknowledgement
Chapter 5: Discussion
5.1 Context and summary of the thesis
5.2 Contribution of the research to the whiplash literature
5.3 Implications of the research
5.4 Future research
5.4.1 Content validity using qualitative methods
5.4.2 Minimizing measurement error
5.4.3 Longitudinal and structural construct validity
5.4.4 Predictive validity
5.4.5 Direct comparison with other relevant instruments
5.4.6 Applicability of the conceptual framework
References
Appendices

List of Tables

Table 2.1: Position statement of our framework and the evidence that is in support of the framework
Table 2.2a: Studies using empirical methods to test differences between clinimetric and psychometric methods
Table 2.2b: Studies using empirical methods to test differences between clinimetric and psychometric methods
Table 3.1: Baseline demographic characteristics of patients with acute whiplash associated disorders.
Table 3.2: Intra-class Correlation Coefficient for the Total Summary Score categorized by the report of no recovery on the change in neck pain question and memory effects
Table 3.3: Sensitivity Analysis for the Intra-class Correlation Coefficient for the Total Summary Score
Table 3.4: Intra-class Correlation Coefficient for individual items of the WDQ

disorders.
Table 4.2: Baseline means, medians and normality values of WDQ total score and individual items
Table 4.3: Model fit statistics for the models with different number of factors in the WDQ
Table 4.4: Factor analysis of the WDQ: The 2-factor solution
Table 4.5: Results of construct validation (n=130). A priori expected Pearson correlations between the WDQ, its subdomains and constructs shown (E) followed by observed/achieved results (A)
Table 4.6: Effect size, Guyatt’s responsiveness statistic (RS) and standardized response mean (SRM) for participants reporting recovery on the global recovery question (N=62)
Table 4.7: Spearman’s rank correlations and AUCs for responsiveness based on the a priori hypotheses

List of Figures

Figure 1.1: Data collection and data use in analysis addressing objectives two to six
Figure 2.1: Literature search for the measurement divide scoping review
Figure 2.2: Latent construct relationship with causal and indicator variables
Figure 2.3: Conceptual framework bridging clinimetrics and psychometrics
Figure 4.1: Total WDQ baseline distribution
Figure 4.2: Factor analysis scree plot

List of Appendices

Appendix 1: Questionnaires
A-1.1: Baseline Questionnaire
A-1.2: Three-to-Five Day Follow-up Questionnaire
A-1.3: Six-week Follow-up Questionnaire
A-1.4: Addition to WIT Baseline Questionnaire
A-1.5: Addition to WIT Six-week Follow-up Questionnaire
Appendix 2: Ethics Certificates
A-2.1: University Health Network Ethics Approval
A-2.2: University of Toronto Ethics Approval
Appendix 3: Baseline WDQ Distributions
Appendix 4: COSMIN Checklist completed with criteria relevant to this thesis

List of abbreviations

AIC Akaike information criteria
ANOVA Analysis of Variance
AUC Area under the curve
CES-D Center for Epidemiologic Studies Depression Scale
CINAHL Cumulative Index to Nursing and Allied Health
COSMIN COnsensus-based Standards for the selection of health Measurement INstruments
DASH Disabilities of the Arm, Shoulder and Hands
EFA Exploratory Factor Analysis
ES Effect Size
GTA Greater Toronto Area
ICC Intra-class Correlation Coefficient
ICF International Classification of Functioning, Disability and Health framework
IRT Item Response Theory
KMO Kaiser-Meyer-Olkin
MCID Minimal Clinically Important Difference
MDC Minimal Detectable Change
MeSH Medical Subject Heading
NDI Neck Disability Index
NPTF Neck Pain Task Force
NRS Numerical Rating Scale
PCA Principal Component Analysis
QTF Quebec Task Force
RMSR Root Mean Square Residual
ROC Receiver operating characteristic
SBC Schwarz Bayesian criteria
SEM Standard Error of Measurement
SF-36 Short-Form Health Survey containing 36 items from the Medical Outcomes Study
SRM Standardized Response Mean
TLRC Tucker and Lewis Reliability coefficient
UHN University Health Network
VAS Visual Analog Scale
WAD Whiplash-associated Disorders
WDQ Whiplash Disability Questionnaire
WHO World Health Organization
WIT Whiplash Intervention Trial
WOMAC Western Ontario and McMaster Universities Osteoarthritis Index

Preface

Background

The general purpose of my thesis was to determine the measurement properties of the Whiplash

Disability Questionnaire (WDQ) in patients with acute Whiplash-Associated Disorders (WAD)

and to develop a conceptual theory on the difference between clinimetrics and psychometrics. In

order to meet these goals, I performed a scoping literature review and I designed a cohort study

that involved primary data collection. Potential participants for this cohort study were recruited

alongside the University Health Network (UHN) Whiplash Intervention Trial (WIT) but

participation for this study was offered regardless of their eligibility for the trial.[26] The UHN

WIT investigated the effectiveness of programs of care in improving recovery of patients with

recent WAD. The recruited population for the UHN WIT included adults who made an

insurance claim for traffic injuries to a large Ontario insurer (Aviva Canada) between February

2008 and June 2012 with WAD diagnoses Grades I-II[113] of less than 3 weeks duration.

Participants were given the opportunity to participate in both studies but the cohort study also

included WAD Grade III and had a shorter recruitment period from February 2008 to August

2009. The UHN WIT was led by Dr. Pierre Côté. I was a clinical research coordinator of the

UHN WIT and the cohort study as well as one of the co-authors.

The objectives of the research conducted for my doctoral dissertation are separate from the

randomized controlled trial led by Dr. Pierre Côté. The UHN WIT provided the infrastructure

for recruiting participants within 21 days of their collision. Without this infrastructure, recruiting

participants with acute WAD would not be possible in Ontario for a small cohort study. Within

this infrastructure, claims adjusters identified potential study subjects when policy holders

contacted AVIVA’s claim center to report an injury. A short screening tool was designed to

assist adjusters in identifying eligible participants. The tool prompted the adjusters to inquire

about their location of residence (GTA, Barrie, Brantford, Burlington, Cambridge, Guelph,

Hamilton, Kitchener-Waterloo, Newmarket, Oshawa, and surrounding towns); their age (18

years or older); whether they were making an injury claim and whether their collision was within

21 days of reporting the injury. If they satisfied these conditions, the adjusters invited them to

enter a study at UHN, and asked permission to release their name and phone number to the UHN

research team. If they agreed, the claimant was referred immediately to one of the clinical

research coordinators and booked for eligibility assessment. The offer to participate in both

studies was given if potential participants were determined to be eligible after the history,

physical exam and, if needed, a radiological exam performed by the clinical research

coordinators. Informed consent was obtained separately for each study. Some baseline and

six-week follow-up data were the same for both studies, and those data were collected only once for

participants in both studies, with only a few additional questions asked for the cohort study.

These data collection procedures reduced the burden on study participants and provided the

cohort with a rich dataset appropriate for analysis of measurement properties of the WDQ in

acute WAD.

Roles and Responsibilities

As a clarification of the roles and responsibilities, my specific tasks in the conduct of this

research over the past six and a half years are outlined below:

i. Designed the study and defended the protocol in May 2008;

ii. Wrote the ethics applications to the University Health Network and University of

Toronto;

iii. Coordinated participant recruitment and data collection including baseline, 3-5 day

reliability study follow-up and the 6-week responsiveness study follow-up;

iv. Developed, cleaned, validated and managed databases used for the cohort study;

v. Conducted the analysis for the test-retest reliability, factor analysis, construct validity

and responsiveness;

vi. Designed, led and contributed to the scoping review of literature as one of two

reviewers;

vii. Conceptualized the scoping review framework based on content analysis with one

other author;

viii. Served as the primary author and lead writer of all the papers presented in this thesis.

Chapter 1:

Introduction

1.1 Measuring disability in health research

In the era of evidence-based medicine and health care accountability, measuring outcomes with

validated outcome measures is the essential building block for developing evidence and

implementing it into practice.[52,104] How health outcomes are assessed directly impacts on the

development of effective therapies and on the evaluation of their cost-effectiveness. Outcome

measures need to be validated for use in research and in clinical practice.[33] Validation of an

instrument means that measurement properties have been tested to ensure that the instrument

measures what it purports to measure and that it can accurately demonstrate change over a

clinically relevant period of time. Without adequate measurement properties, results of clinical

trials would be biased and change demonstrated in clinical settings would be inaccurate. Results

from clinical studies are only as good as the instruments used to measure outcomes in those

studies.

Unlike weight or blood pressure, many health outcomes cannot be directly measured. Therefore,

many outcomes are defined and measured as latent constructs. The development and testing of

instruments used to measure latent constructs is complex. Developers of instruments need to

consider the definition of the construct that is being measured, the time component of what is

measured (e.g. change over time, current state), items that should be included in self-report

outcome measures to capture the scope of the construct and how to score the measure.[33] Once

developed, outcome measures must be tested to establish reliability, validity, and responsiveness.

Evaluating an instrument’s ability to accurately measure latent constructs can be complicated by

the lack of consistency in the terminology and methods used in the field of measurement. I used

the definitions provided by the consensus-based standards for the selection of health

measurement instruments (COSMIN) group because they used Delphi methods to reach

consensus on taxonomy, terminology and definitions related to measurement.[89] They defined

reliability as ‘the extent to which scores for patients who have not changed are the same for

repeated measurement under several conditions’. These conditions can be categorized into

different types of reliability including: 1. internal consistency (e.g. using a different set of items

from the same health-related patient-reported outcome); 2. test-retest (e.g. repeating the measurement over

time); 3. inter-rater (e.g. testing the condition by different persons on the same occasion); and 4.

intra-rater reliability (e.g. testing the condition by the same raters on different occasions).[89]

Validity was defined as ‘the degree to which an instrument measures the construct(s) it purports

to measure’. Validity can also be assessed using different criteria and was categorized into: 1.

content validity (e.g. if the instrument adequately reflects the construct); 2. construct validity

(e.g. if the instrument is consistent with hypotheses relating to internal and external instrument

relationships and relevant group differences demonstrating measurement of the construct); and 3.

criterion validity (e.g. if the instrument is an adequate reflection of a gold standard).

Responsiveness was defined as ‘the ability of an instrument to detect change over time in the

construct to be measured’. Finally, the outcome measure must be interpretable, meaning that a

qualitative meaning can be assigned to the quantitative or change scores to some degree.[89]
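As a minimal sketch of how one of these properties is typically quantified, the following Python function computes Cronbach's alpha, a widely used coefficient of internal consistency. The function name and the example responses are hypothetical illustrations and are not drawn from the thesis data; the COSMIN definitions quoted above do not prescribe this particular code.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of item scores.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total scores))
    """
    n_items = len(responses[0])
    if n_items < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two respondents")
    if any(len(row) != n_items for row in responses):
        raise ValueError("every respondent must answer every item")

    # Sample variance of each item across respondents.
    item_variances = [
        variance([row[i] for row in responses]) for i in range(n_items)
    ]
    # Sample variance of the summed (total) score.
    total_variance = variance([sum(row) for row in responses])
    if total_variance == 0:
        raise ValueError("total scores do not vary; alpha is undefined")

    return n_items / (n_items - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical responses from four people to three items scored 0-10.
example = [[2, 3, 2], [5, 5, 6], [7, 8, 7], [9, 9, 10]]
alpha = cronbach_alpha(example)
```

Values of alpha close to 1 indicate that the items vary together, which is what would be expected when all items reflect a single construct.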

Adequate measurement properties including reliability, validity, responsiveness and

interpretability are necessary when using an instrument in clinical and research settings to ensure

accurate measurement of outcomes. However, measurement properties are specific to the

condition, setting and population in which the instrument is assessed.[119] Therefore,

researchers and clinicians must consider the conditions, settings and populations in which

instruments are to be used or to which the instrument needs to be applicable when determining

their properties.

1.2 Epidemiology of Whiplash-associated Disorders

1.2.1 Definition

The Quebec Task Force (QTF) on Whiplash-Associated Disorders defined whiplash as an

acceleration-deceleration mechanism of energy transfer to the neck which may result in bony or

soft tissue injuries that commonly occurs in motor vehicle collisions.[113] The resulting

whiplash associated disorders (WAD) are defined as a “clinical manifestation of, or the disability

caused by, whiplash injury and may include biologic, psychological, and social symptoms of the

potential tissue damage”.[99,113] Common WAD symptoms include neck pain, back pain,

headache, dizziness, arm pain, concentration problems and depression.[13,19,113]

1.2.2 The burden of whiplash-associated disorders in the population

Whiplash injuries are common following motor vehicle collisions. In the United States,

whiplash-related injuries were reported as the most common emergency department-treated

motor vehicle injury in 2000 with an incidence of 328 visits per 100,000 inhabitants.[101] In

2008, a systematic review of literature on the burden of neck pain and associated disorders such

as WAD was published by The 2000–2010 Bone and Joint Decade Task Force on Neck Pain and

Its Associated Disorders.[61] This systematic review estimated the annual incidence to be at

least 300 per 100,000 inhabitants in North America and western Europe.[61] It also reported that

the incidence of WAD differed substantially between countries. Similarly, a 2008 study by the

European Insurance Committee found that the incidence of minor cervical trauma (defined as a

percentage of overall claims) varied widely across ten European countries with the lowest

incidence found in France (3% of all bodily injuries) and the highest in Great Britain (76% of all

bodily injuries).[20] This study also found that the cost of minor cervical trauma varied greatly

between countries with Switzerland having higher costs (average cost of 35,000 euro per claim)

compared to other European countries (average cost of 9,000 euro per claim). However, these

cost differences did not reflect the difference in incidence across countries.

The incidence of WAD also varies across Canadian provinces. In Saskatchewan, the six-month

incidence of WAD was approximately 300 cases per 100,000 inhabitants in 1995.[19] WAD

were reported by 83% of all eligible participants in this cohort.[19,25] In contrast, the 12-month

incidence was 70 cases per 100,000 inhabitants in Quebec.[113] Different compensation

systems have been shown to influence the incidence and prognosis of WAD and may provide

part of the explanation for the varied reporting of injuries across provinces and countries.[19]

Multiple studies have reported that the incidence of WAD is higher in women and in

younger people.[25,27,61,113] It can also be influenced by several personal,

societal and environmental risk factors.[61]

1.2.3 Prognosis of Whiplash-associated Disorders

Whiplash injuries are an important cause of persistent disability. Although the QTF originally

reported that WAD is a self-limiting condition with a favourable prognosis, subsequent studies

found that the course of WAD varies greatly between jurisdictions and insurance

systems.[28,113] In Saskatchewan, the median time to recovery decreased from 433 days in

1994 for claimants under the tort system to 200 days in 1995 for those insured under the no-fault

system.[19] In contrast, the original study from the QTF on WAD in 1987 reported the median

time on compensation to be 30 days, with 4.1% of individuals still receiving compensation one

year after the collision.[28,113] A review of literature published by the NPTF on the course and

prognosis of WAD reported that approximately 50% of those with WAD will report neck pain

symptoms one-year after their injuries.[15]

1.2.3.1 Prognostic factors for Recovery From WAD

The prognosis of WAD is complex and influenced by physical and psychological factors.

Studies have found that greater initial pain intensity, more symptoms and greater initial disability

predict slower recovery from WAD.[15,135] Pre- and post-injury psychosocial factors such as

passive coping, depressed mood and fear of movement are also predictive of slower

recovery.[17,97,114] Other studies reported that sociodemographic factors (e.g. female gender,

lower education), general health before the injury and insurance/compensation systems under

which benefits can be claimed were associated with WAD recovery.[15,19,28,135] In addition,

an individual’s expectation of recovery is an important prognostic factor for delayed recovery

with those reporting poor expectations showing much slower rates of recovery than those who

expect to get better soon after their injury.[62,94,95]

1.2.4 Treatment of Whiplash-associated Disorders

Identification of effective therapies through research studies is important in providing evidence-

based care that can influence the prognosis of WAD. The NPTF systematic review on the

treatment of neck pain reported that there is evidence that educational videos, mobilization and

exercises appear more beneficial than usual care or physical modalities in promoting the

recovery of patients with WAD.[67] However, the role of education in the management of

WAD is being debated as evidenced by two recent systematic reviews that reached different

conclusions.[53,130] Moreover, evidence from observational studies suggests that early

intensive management of WAD may delay recovery.[29,30] Similarly, a population-based

cohort study from Saskatchewan has shown that individuals receiving fitness training and

outpatient rehabilitation had a 19-50% slower recovery from WAD.[18] The effectiveness of

rehabilitation, training programs and other health care services commonly provided to patients

with WAD needs to be determined in randomized controlled trials.[71]

For clinical trials to accurately demonstrate therapy effectiveness, appropriate outcome measures

must be used to evaluate the clinical evolution of a condition. Currently used measures in WAD

clinical trials have focused on the assessment of disability related to the neck. Considering that

WAD commonly present with a constellation of symptoms, currently used measures may be

missing the full spectrum of disability and recovery from WAD.[59] Furthermore, clinical

outcome measures must demonstrate good reliability, validity and responsiveness in order to be

useful clinically and for research purposes.[119]

1.2.5 Outcome measures currently used in WAD research

The construct of disability is difficult to define and measure. It is a concept that is not physically

tangible and can be highly contextualized; therefore, it may differ from person to person and

from situation to situation.[5] While previous definitions focused on activity limitations, the

most current International Classification of Functioning (ICF) framework proposes that disability

includes impairments, activity limitations, and participation restrictions.[142] The new ICF

model attempts to capture aspects of the condition covered not only by impairment and activity

limitation but also its effect on the individual’s participation in life events. Because it

encompasses the effect of the disability on all aspects of the individual, the ICF is a useful model

to base the measurement of WAD disability on. To be valid, self-report outcome measures need

to capture all components of a construct. Most measures currently used to measure WAD-

related disability do not have a body of evidence that supports their construct definition,

comprehensiveness, validity or reproducibility.

A commonly used outcome measure in whiplash research is the Neck Disability Index (NDI).

The NDI was developed to capture neck-specific disability and consists of 10 items, each with 6

response options rated from 0 (no disability) to 5 (maximal disability).[56,116,131,139] The

items cover pain intensity and the effect of neck pain on function related to

personal care, lifting, reading, headaches, concentration, work, driving, sleeping and

recreation.[131] The NDI has been reported to have good construct validity, reliability and

responsiveness in different populations.[56,116,139] However, it was not designed for WAD

and therefore it does not capture all aspects of WAD disability. A review of the published

literature demonstrated that the NDI omits important components of WAD disability because it

centers on neck pain.[63] Specifically, only three of nine disability items (i.e. work, driving, and

sleep) identified by WAD patients as being important are included in the NDI.[63] Other items

important to WAD patients that are not included in the NDI include fatigue, participation in

sports, depression, socializing with friends, frustration and anger.[63] Furthermore, neck pain

patients have been found to have lower general health scores based on the SF-36 outcome

measure specifically in the energy/fatigue, mental health and role-emotional domains compared

to those without neck pain.[70] Therefore, a comprehensive instrument that includes a range of

items that are important to patients would measure the construct of WAD disability more

accurately and perform better as an outcome measure.
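The NDI scoring scheme described above (10 items, each rated from 0 to 5) can be sketched as follows. This is an illustrative sketch only: the function name and the example ratings are hypothetical, and summing the items to a total out of 50 and expressing it as a percentage is the convention assumed here rather than a procedure specified in this chapter.

```python
NDI_ITEMS = 10          # number of NDI items
NDI_MAX_PER_ITEM = 5    # each item is rated 0 (no disability) to 5 (maximal)


def ndi_score(ratings):
    """Return (total, percentage) for a complete set of NDI item ratings."""
    if len(ratings) != NDI_ITEMS:
        raise ValueError(f"expected {NDI_ITEMS} ratings, got {len(ratings)}")
    for rating in ratings:
        if rating not in range(NDI_MAX_PER_ITEM + 1):
            raise ValueError(f"rating {rating!r} is outside 0-{NDI_MAX_PER_ITEM}")
    total = sum(ratings)
    percentage = 100 * total / (NDI_ITEMS * NDI_MAX_PER_ITEM)
    return total, percentage


# Hypothetical respondent, not taken from any study data.
total, percentage = ndi_score([3, 2, 4, 1, 0, 2, 3, 1, 2, 2])
```

Because every item concerns the neck, a total of this kind cannot reflect the WAD-specific items listed above (for example fatigue or frustration), however carefully it is scored.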

An instrument recently developed to measure WAD disability is the Whiplash Disability

Questionnaire (WDQ). The WDQ was developed based on the ICF framework of disability.[99]

However, the developers have shown that in chronic WAD patients, the WDQ only includes one

domain/factor suggesting that it does not fully represent the ICF framework. The

psychometric properties of the WDQ were studied in Australian patients with chronic

WAD.[49,99,140] In this population, the WDQ demonstrated good validity, reliability and

responsiveness. Recently, a German translation of the WDQ was also shown to have adequate

measurement properties for patients with chronic whiplash injuries.[87,110] However, the

WDQ’s reliability, validity and responsiveness in patients with acute WAD remain unknown.

To be useful clinically and in research, an outcome measure must have strong measurement

properties throughout the course of the condition. Moreover, because validation of an outcome

measure is specific to the population and setting studied, it is therefore necessary to establish its

measurement properties in a population of patients with acute WAD.[119]

1.3 The measurement divide

Different schools of measurement can have different approaches to instrument development and

evaluation. Two schools relevant to health care, clinimetrics and psychometrics, have been the

source of some debate.[46,92] The international COSMIN research group recently developed a

set of measurement standards using Delphi methods to reach consensus on taxonomy,

terminology and definitions related to measurement properties for health-related patient-reported

outcomes.[89] However, this group consisted largely of researchers adhering to clinimetric

measurement methods. Clinimetrics is a measurement school developed by Feinstein and

focused on measurement relevant to clinical outcomes.[46] Psychometrics is an older school of

measurement developed in psychology with a focus on personal and interpersonal behaviour and

educational testing or examination.[92] Many of the psychometric measurement methods are

clinically relevant and have, therefore, been applied across health care. Inconsistency in methods

used to develop and evaluate instruments is complicated by the existence of these different

schools of measurement mainly because they use different theoretical and empirical

methods.[31,118,143] The debate between the two schools, clinimetrics and psychometrics, has

led to confusion on the appropriateness of various instruments used to measure health

outcomes. I compared and contrasted the clinimetric and psychometric methods to advance this

debate and provide suggestions on the use of outcome measures in the future. As demonstrated

in the next chapter, this comparison led to the development of a conceptual framework that

integrated the theories and methods of both schools. I, therefore, use the term ‘measurement

properties’ in this thesis instead of ‘psychometric’ or ‘clinimetric properties’ when discussing the

evaluation of the WDQ. I propose that using ‘measurement properties’ will minimize confusion

and focus the discussion on properties of the instrument rather than the school of measurement.

1.4 Objectives

1.4.1 General Objectives

My first objective is to determine the measurement properties of the Whiplash Disability

Questionnaire (WDQ) in a cohort of patients with acute WAD. My second objective is to

analyze the divide between clinimetrics and psychometrics and develop a conceptual framework

for the evaluation of measurement properties.

1.4.2 Specific Objectives

In a clinical cohort of patients with recent WAD (less than 21 days duration), we aim to:

1.4.2.1 Clarify conceptual differences between psychometric and clinimetric methods;

1.4.2.2 Determine the short-term test-retest reliability of the WDQ;

1.4.2.3 Determine the factor structure of the WDQ;

1.4.2.4 Determine the internal consistency of the WDQ;

1.4.2.5 Determine the construct validity of the WDQ using the Neck Disability Index and

Short Form General Health Status Survey (SF-36);

1.4.2.6 Determine the short-term responsiveness of the WDQ using the global perceived

improvement question as an indicator of improvement.

1.5 Structure of the Thesis

This thesis is presented as a multiple-paper dissertation with five chapters: an overall

introduction, three papers that address specific objectives of the thesis and an overall discussion.

The sequence of papers was ordered to address conceptual issues first and follow the traditional

order of determining the measurement properties of an outcome measure. Specifically, the papers were

ordered to present reliability first, followed by validity, factor structure, internal consistency and

responsiveness. Each of the three manuscripts was written in a publishable format and includes

an introduction, a methods section, a results section and a discussion.

The thesis consists of the following three papers. Chapter Two: “Measurement Properties: A new

framework to contribute to the debate between the field of clinimetrics and psychometrics”

addresses the conceptual issues relevant to two fields within measurement leading to a new

conceptual framework which was the first objective of this thesis. Chapter Three: “Can

Recovery from Whiplash-associated Disorders be Measured Reliably in Patients with Acute

Whiplash-Associated Disorders? A Test-retest Reliability Study of the Whiplash Disability

Questionnaire” presents information relevant to the second objective of the thesis; specifically,

the 3-5 day test-retest reliability of the WDQ in adults with acute WAD (Figure 1.1). Chapter

Four: “Exploratory Factor Analysis, Validity and Responsiveness of the Whiplash Disability

Questionnaire in Adults with Acute Whiplash-associated Disorders” includes information

relevant to specific objectives three through six (Figure 1.1). Specifically, the factor structure,

internal consistency and construct validity of the WDQ (i.e. objectives three through five) were

determined using baseline data from 130 participants with acute WAD. Objective six (i.e. short-

term responsiveness over six weeks) was established using baseline and six-week follow-up data

(Figure 1.1). All three papers will be submitted for publication before or soon after the doctoral

examination.

Figure 1.1: Data collection and data use in analysis addressing objectives two to six

This dissertation also includes several appendices. The first appendix includes interviewer-

administered baseline and follow-up questionnaires used to address the different specific

objectives of the thesis. Appendix 2 includes ethics approval certificates for this thesis from the

University Health Network and the University of Toronto.

Chapter 2 :

Measurement Properties: A new framework to contribute to the

debate between the field of clinimetrics and psychometrics

2.1 Introduction

Measurement is a core science located at the heart of many intersecting health disciplines.

Consequently, different measurement paradigms have been developed to support the “metrics”

used in the various disciplines. Most common to health research are two fields: psychometrics

and clinimetrics. Psychometrics is the measurement of phenomena that are best measured by

multiple items or attributes reflecting a specific construct (e.g., anxiety or

depression).[57,92,112,145] While psychometrics is popular in health care, the second most

prevalent measurement school is clinimetrics which was introduced by Feinstein in the early

1980s.[46] Clinimetrics focuses on prognostic and diagnostic indices, which may combine

different constructs (e.g., blood pressure, symptoms or previous risk factors) to create a

composite weighted score of risk of a distinct construct. The APGAR is an example of a robust,

well-used clinimetric index using distinct constructs (e.g. grimace, appearance, pulse) to

diagnose/classify the health status of newborns and identify newborns in need of medical

attention.[2]

Various disciplines have developed their “own” measurement theories and methodologies (e.g.,

sociometrics, biometrics, anthropometrics) which address measurement concepts central to their

discipline or area of research.[4,83,123] As previously mentioned, the two most prevalent

schools in health research are psychometrics and clinimetrics. The theories and methodologies

promoted by these two schools are mainly used to design indicators or outcome measures for

research and clinical purposes. They will, therefore, be the focus of this paper. These two

schools are often described as discordant with proponents of each school promoting the strengths

of their approach and highlighting the limitations of the other framework.[31,118,143] We will

deconstruct these similarities and differences in order to move beyond the current debates.

This debate is more acute in the current period of patient-centered care where reimbursement is

often contingent on patient outcomes. Furthermore, regulators are increasingly aware that

measurement standards must be met before the benefits of an intervention are established.[128]

The “Era of Health Care Accountability” described by Relman is dependent on good

measurement, reinforcing Nunnally’s call that accurate measurement of key variables sets the pace

of scientific progress.[92,104,141] However, clinicians and researchers who need to select

patient-based outcomes are confronted with the tensions that exist between the two paradigms of

measurement: clinimetrics and psychometrics. This polarized debate often leads to ambiguity by

making some tools appear inadequate in their development or manifest properties when

approached from one perspective compared to the other. In this paper, we argue that

psychometrics and clinimetrics have more similarities than differences. For example, both fields

emphasize that outcome measures should be standardized, reproducible and accurate.[47,92]

They also share a common interest in the measurement of various latent constructs such as pain,

disability, self-efficacy, appraisal, perceptions and depression.

Despite their common interests, the co-existence of clinimetrics and psychometrics has given rise

to a hearty debate leading some psychometricians to question the need for clinimetrics.[39,118]

These proponents of psychometric theory suggest that the existence of clinimetrics is redundant

because it is not substantially different from the older field of psychometrics.[39,118] They

suggest that, like any classification, distinctions should continue to exist only if they facilitate

accurate communication about clinical and other outcome measures.[39,76] Alternatively,

proponents of clinimetrics suggest that their school is necessary because it offers clinically-based

methods to construct measures even if the measures are of latent constructs such as pain, anxiety

or functioning.[31,143] We propose that the current division between clinimetrics and

psychometrics creates an unneeded schism in an area where much work is needed: measurement

of key health constructs and variables. The debate has led to an unnecessary confusion in

clinical research and has created a barrier for the appropriate choice and use of measures.

Moreover, it has kept the fields separate and limited the advancement of measurement

methodology. Reconciliation between the two schools might lie in revisiting the roots of each

school rather than in continuing to debate the differences.

The purpose of our paper was twofold. First, we performed a scoping literature review to

describe the attributes of the clinimetric-psychometric divide. The aim of a scoping study is ‘to

map rapidly the key concepts underpinning a research area... especially where an area… has not

been reviewed comprehensively before’.[3] Second, we synthesized the findings and developed

a revised framework that highlights the similarities and differences of each, respecting the nature

of the measurement theory.

2.2 Methods

We conducted a scoping review of the literature. Our search included five stages: a)

development of a research question; b) search for relevant studies; c) study selection; d) data

charting; and e) collation, summarizing and reporting the results.[3]

2.2.1 Research question

What are the methodological similarities and differences between clinimetrics and psychometrics

in the development and evaluation of a clinical measure?

2.2.2 Search for relevant studies

We performed a literature search in Medline between 1950 and March 2012 using a combination

of the MeSH terms ‘psychometrics’ (exploded) and ‘health status’ (exploded) and text terms

‘clinimetric*’ and ‘psychometric*’. The terms were combined in the search using ‘and’ as the

combination link (i.e., MeSH ‘psychometrics’ and ‘clinimetric*’ or ‘psychometric*’ and

‘clinimetric*’). The search was limited to publications in English. We performed a similar

search in PsycINFO, CINAHL and Embase databases using the same subject headings as search

terms. Finally, we performed a textbook (title) search in the University of Toronto catalogue

using terms ‘clinimetric*’, ‘psychometric*’, ‘measurement’ and ‘health’. Article bibliographic

reference lists were also searched for relevant literature.
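The Boolean logic of the Medline strategy can be sketched as a filter over bibliographic records. This is only an illustration under stated assumptions: the record structure, field names and example records below are hypothetical, and real databases apply MeSH explosion and truncation through their own query syntax rather than through code like this.

```python
import re

# Truncated text terms: 'clinimetric*' matches clinimetric, clinimetrics, ...
CLINIMETRIC = re.compile(r"\bclinimetric\w*", re.IGNORECASE)
PSYCHOMETRIC = re.compile(r"\bpsychometric\w*", re.IGNORECASE)


def matches_search(record):
    """(MeSH 'psychometrics' AND clinimetric*) OR (psychometric* AND clinimetric*)."""
    text = f"{record['title']} {record['abstract']}"
    has_clinimetric = bool(CLINIMETRIC.search(text))
    has_psychometric_text = bool(PSYCHOMETRIC.search(text))
    has_psychometric_mesh = "Psychometrics" in record["mesh"]
    return (has_psychometric_mesh and has_clinimetric) or (
        has_psychometric_text and has_clinimetric
    )


# Hypothetical records, not taken from the actual search results.
records = [
    {"title": "Clinimetrics versus psychometrics", "abstract": "", "mesh": set()},
    {"title": "A clinimetric index", "abstract": "", "mesh": {"Psychometrics"}},
    {"title": "Psychometric testing of a scale", "abstract": "", "mesh": {"Psychometrics"}},
]
selected = [r["title"] for r in records if matches_search(r)]
```

In this sketch the third record is not selected because it never mentions clinimetrics, which mirrors the requirement that the clinimetric term appear in every branch of the search.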

2.2.3 Study selection

The lead author (MS) reviewed all titles and abstracts and selected articles relevant to the

research question. An article was considered relevant if the major theme of the article was on the

comparison of clinimetrics and psychometrics.

2.2.4 Data charting

Through a series of iterative meetings (MS, DEB), we performed a content analysis of relevant

articles to identify emerging themes and stances taken by different authors. Our data charting

was guided by previous frameworks used to assess health indices and the measurement of

disease-specific quality of life.[55,78] These frameworks identified several categories that were

useful in categorizing our data when investigating the instrument development stages: item

selection, reduction, scaling and questionnaire formatting.[55,78] Moreover, these frameworks

included stages for the process of instrument testing and evaluation including

reliability/reproducibility, validity and responsiveness. We used these categories to chart

similarities and differences between clinimetric and psychometric methods. For example, our

content analysis of articles included identification of similarities and differences in methods used

by clinimetrics and psychometrics in the stages of item selection and reduction.

2.2.5 Collation, summarizing and reporting results including synthesis

We synthesized the themes and findings from the relevant literature through iterative consensus

meetings between two of the authors (MS, DEB). This synthesis led to the development of a

position statement for our framework. We verified the results by revisiting each article to extract

features supporting or contradicting our position statement and presented it to the larger author

group (PC, JDC, EB) for debate and critique. Results from articles demonstrating empirical

testing were also summarized in a separate table to provide more relevant elements in reporting

of empirically based studies.

2.3 Results

2.3.1 Literature search

The Medline search using a combination of the MeSH term ‘psychometrics’ and the key word

‘clinimetric*’ or key words ‘psychometric*’ and ‘clinimetric*’ yielded 90 results (Figure 2.1).

A Medline search using the MeSH term ‘health status’ and the keyword ‘clinimetric*’ yielded 41

results. CINAHL, PsycINFO and EMBASE searches did not identify any new articles and

therefore are not represented in Figure 2.1.

[Figure 2.1 flow diagram; recoverable content. Article search in Medline: MeSH ‘Psychometrics’ AND text word clinimetric*, OR text word psychometric* AND text word clinimetric* (90 results); MeSH ‘Health Status’ AND text word clinimetric* (41 results). Textbook search: text word clinimetric (3), text word psychometric (180), text words measurement AND health (711). Selection based on relevance to the debate: main topic of the article is the differences between clinimetrics and psychometrics. Remaining box labels: “15”, “Duplicates or not relevant”, “22”, “7 articles added from bibliography search”, “0”. Reasons for exclusion: 55 articles assessed a specific instrument; 20 articles not on main topic for other reasons; 894 textbooks not on main topic.]

Figure 2.1: Literature search for the measurement divide scoping review

2.3.2 Study selection

Our search yielded 15 relevant articles (Table 2.1). All articles were published in the early 1990s

following Feinstein’s description of clinimetrics in the 1980s. Additional articles were obtained

from searching article bibliographies (five articles and two replies to included articles). Five of

the relevant articles used empirical methods to test the proposed methodological differences

between clinimetrics and psychometrics (Table 2.2). The textbook search yielded three citations

(one textbook and two theses) using the term clinimetric, 180 citations using the term

psychometric and 711 citations for the combination of terms measurement and health. Textbook

citations were focused on the methods of each field and not on the differences between

psychometrics and clinimetrics. Therefore, we selected relevant textbooks of each field to

inform our article review.[33,47,92] Most articles (61%) were excluded because they focused on

evaluating the measurement properties of specific measures without comparing clinimetric and

psychometric methods.
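The reported counts can be checked with simple arithmetic. The sketch below assumes that the 61% refers to the 90 records from the first Medline search and that the two article exclusion categories in Figure 2.1 account for all excluded records; these assumptions and the variable names are illustrative readings of the reported numbers, not statements made in the text.

```python
medline_records = 90            # MeSH 'psychometrics' / psychometric* AND clinimetric*
excluded_specific = 55          # articles that assessed a specific instrument
excluded_other = 20             # articles not on main topic for other reasons
added_from_bibliographies = 7   # five articles and two replies

relevant_from_search = medline_records - excluded_specific - excluded_other
share_excluded_specific = round(100 * excluded_specific / medline_records)
total_included = relevant_from_search + added_from_bibliographies

# Textbook citations: clinimetric, psychometric, measurement AND health.
textbook_citations = 3 + 180 + 711

print(relevant_from_search, share_excluded_specific, total_included, textbook_citations)
# prints: 15 61 22 894
```

Under these assumptions the figures are internally consistent: 15 relevant articles from the search, 61% excluded for assessing a specific instrument, and 894 textbook citations in total.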

2.3.3 Data charting

Based on our content analysis, we found several emerging themes and positions by different

authors. Specifically, several proponents of clinimetrics reported that the construction of

clinimetric indexes involves a deliberate combination of multiple attributes that are not expected

to produce a homogenous measure (Table 2.1).[31,32,41,42,91,143,146] They also suggested

that clinimetric measures usually contain fewer items than psychometric measures and that the

items are chosen to combine multiple clinical constructs in a single index. This contrasts with

the psychometric approach where different facets of the same construct are preferable.

Furthermore, proponents of clinimetrics propose that “dissected intuition” (defined as

stakeholder or expert input including clinician or patient input) is the fundamental distinction

between the two metric fields in the construction of instruments.[31,143] In contrast,

proponents of psychometrics suggest that there is an equal amount of “dissected intuition”

involved in constructing psychometric instruments, and that the need for the homogeneity

amongst items (i.e. internal consistency) in the outcome measure is dependent on the purpose of

the measure.[118] However, a more purist psychometric instrument development would perhaps

use less opinion or appraisal from stakeholders and focus on indicators of similarity in response

patterns (correlations, factor analysis) to determine items to keep.

Another important distinction between clinimetrics and psychometrics is the nature of the

variables included in an instrument. Clinimetricians suggest that psychometric approaches

include indicator (variables that result from the measured construct) rather than causal variables

(variables that may induce change in the measured construct rather than be the result of

it).[32,44,45] In contrast, clinimetric approaches include mainly causal variables.[32,44,45] As

pointed out by Fayers et al, this is significant because changes in the latent (measured) construct

should be directly and proportionally reflected by the indicator variables, but not necessarily by

the causal variables (Figure 2.2). Therefore, from a statistical perspective, a causal variable may

behave differently from an indicator variable with respect to demonstrating change in the

construct of interest.[44,45]
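The distinction between indicator and causal variables can be illustrated with a small simulation. This is a sketch under simplifying assumptions that are not part of the reviewed articles: the latent construct is taken to be the sum of three independent causal variables, each indicator equals the latent score plus random noise, and all names and parameter values are hypothetical.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


rng = random.Random(2013)  # fixed seed so the sketch is reproducible
n_people = 2000

# Three independent causal variables that together produce the latent construct.
causal = [[rng.gauss(0, 1) for _ in range(n_people)] for _ in range(3)]
latent = [sum(values) for values in zip(*causal)]

# Two indicator variables that each reflect the latent construct plus noise.
indicators = [[score + rng.gauss(0, 0.5) for score in latent] for _ in range(2)]

r_indicators = pearson(indicators[0], indicators[1])  # expected to be high
r_causal = pearson(causal[0], causal[1])              # expected to be near zero
r_causal_latent = pearson(causal[0], latent)          # moderate by construction
```

Under these assumptions the indicators correlate strongly with each other, whereas the causal variables are essentially uncorrelated even though each contributes to the construct. This is one way to see why statistics based on inter-item correlation suit indicator variables but can mislead when applied to causal variables.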

Figure 2.2: Latent construct relationship with causal and indicator variables

In the measurement evaluation phase of an instrument, the two schools did not demonstrate

differences. Both the proponents of clinimetrics and psychometrics suggested that measurement

properties are important and should meet accepted standards (validity, reliability, responsiveness

and interpretability) regardless of the type of development.[32,118]

2.3.4 Collation, summarizing and reporting of results

Feinstein proposed that the conceptual distinction between clinimetrics and psychometrics must

be based on what is being measured; specifically, whether a clinical phenomenon or a

psychosocial/educational construct is being measured.[47] However, the differences may be

more methodological in nature and relate to how measures are developed irrespective of what is

being measured. We present the results of our scoping review as a framework that collates and

summarizes the information extracted from reviewed articles. We divide our framework into three

phases: a) the development and scoring phase; b) the structure/precision phase; and c) the

measurement performance phase (Figure 2.3).

In the item development and scoring phase (Figure 2.3), the difference between the schools is

based on whether the instrument has a targeted criterion (e.g. death, neonatal survival) or an

untargeted criterion (e.g. depression, anxiety). Specifically, the main difference was the use of

clinical consensus (clinimetric) or statistical (psychometric) methods. An instrument measuring

a targeted criterion addresses a clinically relevant, tangible concept with items that do not

correlate highly, representing multiple constructs within that concept (e.g. death causes can be

defined by several constructs that are not related to each other such as traffic accidents or

hypertension). In contrast, an instrument measuring an untargeted criterion aims to measure a

clinically-relevant, intangible concept containing a single construct with highly-correlated items

representing attributes of that construct (e.g. guilt, insomnia and agitation are attributes of

depression which correlate highly with each other).

The Apgar score, a measure of the condition of a newborn baby, is the most notable example of a

purely clinimetric index.[2,48] In developing the scale, Virginia Apgar used her clinical

experience and knowledge to select five objective signs to define the newborn baby’s

condition.[2,48] The five items (heart rate, respiratory effort, reflex irritability, muscle tone and

skin colour) are distinct constructs that are not expected to be highly correlated. These items

were chosen to define a condition that is not itself a single construct. However, it provides a

total clinical score that is predictive of neonatal survival. Other examples of clinimetric scales

are the multi-construct disease activity indexes used in rheumatology (e.g., the Rapid Assessment of

Disease Activity in Rheumatology [RADAR] questionnaire).[86,108] Statistical tests were not used in

their development to demonstrate high correlation between subscales, or to develop a final

homogeneous instrument with highly correlated items measuring a single construct.
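
The structure of a clinimetric index of this kind can be sketched in a few lines of Python. The sketch below assumes the conventional Apgar scoring, in which each of the five signs is rated 0, 1 or 2 and the ratings are summed without weights; that scoring rule is not stated in this section, and the names `ApgarSigns` and `apgar_total` are hypothetical. The point is that the total is formed by clinical convention rather than by any statistical criterion of item homogeneity.

```python
from dataclasses import astuple, dataclass


@dataclass(frozen=True)
class ApgarSigns:
    """Five clinically chosen signs; each is assumed to be rated 0, 1 or 2."""
    heart_rate: int
    respiratory_effort: int
    reflex_irritability: int
    muscle_tone: int
    skin_colour: int


def apgar_total(signs: ApgarSigns) -> int:
    """Unweighted sum of the five ratings (range 0 to 10 under the assumed scoring)."""
    scores = astuple(signs)
    if any(score not in (0, 1, 2) for score in scores):
        raise ValueError("each sign must be rated 0, 1 or 2")
    return sum(scores)


if __name__ == "__main__":
    newborn = ApgarSigns(heart_rate=2, respiratory_effort=2,
                         reflex_irritability=1, muscle_tone=2, skin_colour=1)
    print(apgar_total(newborn))
```

No correlation between the five ratings is required for the total to be meaningful, which is why an index of internal consistency is not a natural quality criterion for such a scale.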

In contrast to clinimetric instruments, psychometric scales such as the Hamilton depression and

anxiety scales were developed using statistical methods for item selection.[57,112,145] The

Hamilton depression scale was developed using items relevant to the construct of depression

(e.g. depressed mood, suicidal thoughts, agitation, hypochondriasis) that correlate highly and that

were selected using factor analysis to produce the 17-item instrument.[57] The development of

psychometric scales relied heavily on factor analysis to identify items that are strongly

correlated. They focused on developing a homogeneous set of items that measured single

constructs. These scales contain multiple related items or attributes (e.g., guilt, insomnia,

agitation) as indicators of a single concept (e.g., depression).[57] In psychometric scales, internal

consistency is expected to be high.
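
Internal consistency is commonly summarized with Cronbach's alpha, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores), where k is the number of items. The following Python sketch implements this standard formula as an illustration; the function name `cronbach_alpha` and the small data set are assumptions and are not taken from any instrument discussed here.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for a list of items.

    `items` holds one list of scores per item, with respondents in the
    same order in every list. Sample variances (n - 1 denominator) are used.
    """
    k = len(items)
    if k < 2:
        raise ValueError("alpha requires at least two items")
    # Total score per respondent across all items.
    totals = [sum(scores) for scores in zip(*items)]
    item_variance_sum = sum(variance(scores) for scores in items)
    return (k / (k - 1)) * (1 - item_variance_sum / variance(totals))


if __name__ == "__main__":
    # Hypothetical ratings of four respondents on two items.
    print(round(cronbach_alpha([[1, 2, 3, 4], [2, 1, 4, 3]]), 2))
```

Because alpha rises as items become more strongly intercorrelated, a high value is expected for a homogeneous psychometric scale but need not be expected for an index built from distinct constructs.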

We found scales that used hybrid methods. These scales are located on the continuum between

the two poles presented in Figure 2.3. For example, the Whiplash Disability Questionnaire

(WDQ) was constructed using information collected from clinician expert opinion and

patients.[99] However, a more psychometric method of principal component analysis was

applied to finalize the instrument. In contrast, the Disabilities of the Arm, Shoulder and Hand

(DASH) questionnaire was developed using statistical methods.[64] Specifically, the 30 items

included in the DASH were selected by subjecting 70 items to the equidiscriminatory item-total

correlation method, which selects items that are highly correlated but discriminate well

between individuals throughout the range of scores.[64] However, the statistical process was

supplemented (at the stage of final item retention) by patient opinion on the importance of items

and by clinician expert opinion. Therefore, the DASH development initially relied on

psychometric methods, but a parallel clinimetric approach was used to complete its development.

The authors intentionally used both measurement schools. The WDQ and the DASH are

examples of instruments located on the methodological continuum between the purely clinimetric

and purely psychometric approaches; the WDQ is located closer to clinimetrics while the DASH is

closer to psychometrics (Figure 2.3).
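
The general idea behind item-total statistics can be sketched as follows. This Python example computes a generic corrected item-total correlation (each item correlated with the sum of the remaining items); it is not the equidiscriminatory procedure used for the DASH, whose details are not given here, and the function names and data are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def corrected_item_total(items):
    """Correlate each item with the total of all other items.

    `items` holds one list of scores per item, with respondents in the
    same order in every list.
    """
    correlations = []
    for index, item in enumerate(items):
        # Total of the remaining items, so the item is not correlated with itself.
        rest = [sum(row) - row[index] for row in zip(*items)]
        correlations.append(pearson(item, rest))
    return correlations


if __name__ == "__main__":
    # Hypothetical ratings of four respondents on three items.
    pool = [[1, 2, 3, 4], [2, 1, 4, 3], [1, 2, 3, 4]]
    for number, r in enumerate(corrected_item_total(pool), start=1):
        print(f"item {number}: r = {r:.2f}")
```

In a statistically driven development, items with low values would be candidates for removal. In a hybrid development such as that described above, patient and clinician opinion can override the statistic at the stage of final item retention.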

Several studies have compared clinimetric and psychometric development and evaluation

methods (Table 2.2b).[6,73,85,106,127] Four studies suggested that the application of

clinimetric and psychometric development methods leads to the retention of different

items,[6,73,85,127] while one study reported that it resulted in the selection of different domains

within the instrument.[106] Interestingly, four studies reported that the same types of

measurement properties are assessed in both fields (i.e., reliability, validity and

responsiveness).[6,73,85,106,127] The authors found that the measurement properties were

similar regardless of how the measure was created. The clinimetrically-developed instrument

(described as the concept-retention instrument) was recommended in the study by Beaton et al

due to its similarity to the original DASH but the authors demonstrated that all three instruments

would perform similarly in terms of their measurement properties.[6] Our analysis confirms that

the strategies for evaluating the performance of outcome measures are similar for clinimetric,

psychometric and hybrid scales.[32,41] This suggests that there is no divide when assessing the

measurement properties of instruments. The same measurement properties are necessary for

both clinimetric and psychometric instruments regardless of the use of statistical methods in

development. However, some authors have suggested that face and content validity are more

important in the development of clinimetric rather than psychometric instruments since there is

more focus on clinical intuition.[31,146] Other authors suggested that the definition of construct

validity differed between the fields but these authors erroneously defined construct validity as

the assessment of internal consistency.[36] In contrast, Streiner suggested that a priori

hypotheses should be formed to assess validity based on the goal of the instrument and that this

is equally important to both fields.[118]

Figure 2.3: Conceptual framework bridging clinimetrics and psychometrics

Table 2.1: Position statement of our framework and the evidence that is in support of the framework

Position statement of our framework

Current Thesis

We propose that measurement is a large field that consists of several subfields including clinimetrics and psychometrics which may overlap in some areas and be distinct in others. We propose that the differences between clinimetrics and psychometrics are in the development of outcome measures and not in the measurement performance stage. Furthermore, we propose that there is an overlap (or an informed zone) in the use of development techniques between the field specific techniques at either extreme.

Defined by: | Support for our statement | Contradictory to our statement | Comments

Wright et al 1992

Suggest that clinimetric indexes are heterogenous while psychometric indexes aim for homogeneity

Clinimetric indexes tend to have fewer items to facilitate ease of scale usage while psychometric scales include large number of items to improve homogeneity

Suggest that dissected intuition (clinical reasoning) is used in clinimetrics but not in psychometrics and that development of psychometric scales depends almost completely on empirical/statistical decisions of what gets included in the index

These authors create polar distinctions between clinimetric and psychometric methods of scale development

The authors do not discuss if any overlap between the two fields exists

Juniper et al 1997

Direct assessment of the two approaches in developing a measure lead to the retention of different items

None presented

Empirical evaluation of the clinimetric and psychometric development of measures

Measurement properties not assessed

Fayers et al 1997 p. 139

Suggest that exploratory factor analysis (EFA) is inappropriate in the development of clinimetric indices because EFA cannot model indicators that have a causal effect on the latent construct

Suggest that EFA can lead to inconsistent results across studies when applied to indices with causal indicators because correlations between causal indicators may not reflect the manifestation of a change in the latent factor

None presented

In support of differential use of factor analysis depending on the types of items included in the instrument which may reflect more clinimetric instruments but to continue its use in psychometrics

Used literature on two different instruments as examples to demonstrate support for inappropriateness of EFA for clinimetric instruments

Fayers et al 1997 p. 393

Suggest that quality of life instruments have a combination of causal and effect indicators suggesting that they are neither purely clinimetric nor purely psychometric

Used a global quality of life question to demonstrate correlation with the effect indicators and a lack of correlation with potentially causal indicators

Suggest that statistical methods such as factor analysis are not appropriate when assessing causal indicators because causal indicators do not reflect the latent construct (i.e. they may be side-effects of a condition causing the changes in the latent construct)

None presented

Support for the informed zone and the differences in development of clinimetric and psychometric instruments

Admit that there is no definite way to determine if indicators are causal but suggest some methods that are suggestive of these items behaving differently than effect indicators when submitted to statistical analysis

Although the methods using data are not necessarily empirical proof of differences, suggest that there is a difference in the types of instruments and that certain statistical methods should not be applied in every case

Feinstein 1999

Using the Apgar score as an example of a clinimetric index, suggests that dissected intuition or clinical judgment, ease of understanding and ease of application are important in clinimetrics but not as much in psychometrics

Heterogeneity important in clinimetrics but homogeneity important in psychometrics

Suggests that there is no evidence that one method is better than the other but that clinicians should not rely solely on statistical methods in developing measures since successful methods of one field are not necessarily transportable to another field

Psychometric methods reject diversity and simple ratings in measures

Argues that face validity is crucial in clinimetrics but not in psychometrics because pertinent topics may be omitted due to poor statistical correlation with the retained items

Co-authored Wright et al paper and used similar polar distinctions between clinimetric and psychometric methods of scale development

Stated that there is no evidence that one method is better than the other but that clinical judgment should play a role in developing measures

Support for differences in development stages

Marx et al 1999

Different items are retained by different methods resulting in slightly different final measures

The measurement properties were assessed and satisfactory for both final measures

Dissected intuition was used by both methods to finalize the measures

Once 'dissected intuition' was used to change the items to finalize the scale, the two methods resulted in similar scales with similar measurement properties

None presented

Items retained in measures developed by the two methods differed but were more similar once clinical judgment was used in both cases to finalize the instruments (supportive of our informed zone)

Methods to assess measurement properties were the same irrespective of the development method used to create the instrument (support for the measurement performance stages)

Zyzanski 1999

Agree with Feinstein that clinimetrics is significant on its own and that it is an appropriate starting point for statistical scale development because diversity based on clinical knowledge should be included to satisfy face and content validity

Initial adherence to statistical and psychometric methods may be appropriate even if the ultimate goal is a clinimetric measure

Skills and insights of both the clinician and the psychometrician are required in development of measures since neither approach alone is sufficient

Clinical insight is the essential ingredient missing from psychometric methods which sophisticated statistics cannot provide

Editorial in support for the developmental differences between clinimetrics and psychometrics

Supports our framework in terms of the informed zone because suggests that methods used both by clinicians and psychometricians are required in measure development

Fayers et al 2002

Suggest that causal variables may account for the difference in clinimetric and psychometric instruments and explain that both are needed

Argue that causal variables affect the latent variable, not simply indicators of it (as effect indicators used in psychometric scales would be)

Also argue that causal variables are sufficient but not necessary component causes of a latent construct (using standardized equation modeling terminology)

Suggest that traditional psychometric statistical approaches (i.e. factor analysis) may be inappropriate for clinimetric index development because covariances between causal variables may exist regardless of their relationship with the underlying factor which may suggest additional factors distinct from the indicators

Suggest that no data analysis is necessary to decide how to combine individual items of a clinimetric model and to determine the relative importance of items but that this is not a shortcoming because the aim of clinimetric and psychometric instrument development is different

Although suggest that construction and assessment differs between methods, the same types of performance methods are used to assess both (i.e. validity, reliability, responsiveness)

Support that different instruments will likely result from uncritical application of either method and either might be inadequate

Suggest that methods of validating instruments depends on the type of instrument and that inter-item correlation and internal consistency should not be expected to be high in instruments containing causal items

Suggest that content validity may be more important in clinimetrics because omission of an item that is not correlated to other items because it is causal may have a greater impact on the performance of the instrument than in psychometric instruments where omission of one item tapping into the same underlying concept may have no effect on performance

Suggest that differential item functioning can be used to differentiate between indicator and causal variables and that scales should in general include one or the other, not both

Suggest that psychometric methods can be used in multi-item scales to examine subsets of indicator variables but not combined with causal variables

Support for the difference in development stages of our framework with an explanation using causal and indicator variables to explain why some statistical methods (i.e. factor analysis) may fail for clinimetric indices

Support for the measurement performance stage of our framework because the same type of measure assessments are suggested by both methods

Bartholomew 2002 (reply to Fayers et al 2002)

Collaboration between statisticians and practitioners in psychology and medicine (those constructing clinical and psychometric instruments) is needed to improve understanding of methods

Factor analysis does not work when applied to a mixture of causal and indicator variables

Suggest that effect indicators are indicators of both the latent variable and the causal indicators (mediating the latent variable) and suggest that causal indicators do not need to be included in instruments as long as all appropriate effect indicators are included

Support for the informed zone in the sense that different experts have to work together to improve methods

Suggest that statistical and psychometric approaches are similar and that factor analysis is not appropriate if causal factors are included

Dijkers et al 2003

Suggest that clinimetric and psychometrics differ in development stages because psychometrics aims for unidimensional scales including only effect indicators while clinimetric indices include multiple attributes that could be weighted differently for greater responsiveness and include causal indicators as well as effect indicators

Suggest that the clinimetric approach is more appropriate in assessing environments and that difference in development may result in differences in the results of measurement properties

Suggest that content validity may be more important to clinimetrics but it is important in both fields and that internal consistency may be irrelevant to clinimetric scales but validity and reliability need to be assessed for all instruments

Suggest that clinimetrics uses many procedures developed in psychometrics

Argue that suitability of statistical procedures need to be considered instead of just applied because of familiarity when developing new scales for assessment of environments

None presented

Critiqued an article validating a measure assessing environments to discuss differences between clinimetrics and psychometrics

Support for difference in the development stages of our framework and the type of assessments performed in our measurement performance stage

Support for the informed zone since clinimetrics uses many of the procedures developed in psychometrics

Diamond 2003 (reply to Dijkers)

Psychometricians are just as interested in combining multiple measures into a single score as are clinicians and the important goal in psychometrics is the estimation of the validity of inferences made from measurements, not just the estimation of reliability

Early stages of any search for valid inferences should include content experts of the specific field of interest (i.e. clinician for health care), not just psychometricians (considered to be quantitative)

None presented

Argues against the divide of Dijkers et al (2002) by arguing in support of our informed zone by bringing the stakeholder opinion into the development of a psychometric scale

Marion 2003 (reply to Dijkers)

Suggest that development of measuring instruments can be done with no or very few mathematical or theoretical tools as long as the developers define what is being measured

Agrees with Dijkers that assessment of the measure's utility, consistency and meaning does not require elaborate analysis or models (just clinical judgment)

None presented

Supports difference between methods in our development stages in suggesting that clinical intuition should play a significant role in developing clinical indices

de Vet et al 2003 p. 1137

Consider clinimetrics and psychometrics as distinct and related fields but use 'clinimetric properties' as a term for measurement properties including those previously used in psychometrics (i.e. validity, reliability)

Suggest that measures developed using IRT are less dependent on population and situations but that IRT use in clinimetrics may be limited because it aims to develop unidimensional instruments

Suggest that the aim of the measure determines which measurement properties are important

Suggest that a close collaboration between clinicians, statisticians, epidemiologists and psychologist are needed to improve future use of clinimetrics in clinical research and practice

Claim that all measures in clinical research are clinimetric

Suggest that face validity is more important in clinimetrics

Support for the difference in the development and informed zone by suggesting that experts from different fields needed in clinimetrics but also the performance stage by suggesting that the aim of the measure determines the type of measure evaluation, not the method of development

Streiner et al 2003

Suggest that just as much 'dissected intuition' is used in psychometrics as in clinimetrics

Recognize that there is a difference that they describe as more qualitative based on how much dependence there is on using only clinical judgement in item selection (suggesting that it is high in clinimetrics and low in psychometrics)

Argue against the suggestion that all questionnaires in psychometrics are unidimensional and that all medical ones are heterogeneous because that would overlook the diversity of both fields

Suggest that the use of the term clinimetrics has led to ignorance of the literature on measurement properties that was developed in psychometrics and this supports our section on measurement properties in that this area should not differ between fields and that information from both fields should be used

Suggest that scales that should be homogeneous should be evaluated with an index of internal consistency but that not all scales are homogeneous

Suggest that internal consistency and indices of reproducibility (i.e. test-retest or inter-rater reliability) are not alternatives but that they measure different performance aspects of a scale and that both could be used in both fields

Suggest that the use of the term clinimetrics has led to a misunderstanding of the procedures used to develop scales

Suggest that the field of clinimetrics should not exist as a separate field from psychometrics because it is only a subset of psychometrics with concepts that do not differ substantially

Conclusion: "clinimetrics is not describing a new family of techniques that should be used with a unique type of scale, but is simply another word for a portion of what is done in psychometrics" (pp. 1144)

Streiner presents a search strategy to show that most measurement property articles can be found with psychometrics and less with clinimetrics pointing out that this vast information can be missed by searching for clinimetrics alone and that ignorance and misunderstanding can result but we would suggest to generalize the terminology to include all literature by indexing them in databases as measurement properties (not clinimetric or psychometric or any other measurement subfield)

Supportive of our diagram in the sense that not all clinimetric measures are heterogenous and not all psychometric ones are unidimensional; there is a transition between the fields (support for our informed zone)

Grading of the use of clinical judgement in developing measures is also supportive of our informed zone

de Vet et al 2003 p. 1146 (reply to Streiner)

Measurement defined as 'metrics' that can be subfields of psychometrics, clinimetrics, biometrics

Suggest that psychometrics and clinimetrics are not contradictory but have different aims which are more content driven for clinimetrics (i.e. measure multiple constructs with a single index) and more statistically driven for psychometrics (i.e. measure a single construct using multiple items)

Suggest that differences are mostly evident in development stages with psychometric measurement instruments including only indicator variables (that correlate with the underlying construct to be measured but do not alter or influence the construct, not causal) while clinimetric measurement instruments may include indicator and causal variables

Suggest that the differences between clinimetric and psychometric approaches are less obvious in the evaluation of the outcome measure

Suggest that both metric disciplines may use the same methodologic and statistical approaches, depending on the goal and subject of measurement

None presented

Reply to Streiner pointing out differences between clinimetrics and psychometrics in support that they should be different but make use of methods from one another when required by the goal of the measure

Blended approaches and our informed zone supported

Bech 2004

Concordance between clinical and statistical coherence of symptoms is essential to clinimetrics and has been demonstrated in a shortened depression questionnaire, the HAM-D6 (with improved responsiveness)

Classical psychometrics can result in an alternative to clinical thinking which is not recommended but modern psychometrics is more integrative

Cronbach's alpha is dependent on the number of items in the questionnaire (which makes it an indicator of length of the measure) and not relevant in clinimetrics

Discuss clinimetrics and psychometrics as they relate to psychiatry

Support for the informed zone because clinical and statistical methods need to be in concordance for better measure development and modern psychometrics is more integrative in agreement with that

Emmelkamp 2004

Although clinimetrics has introduced sensitivity to change as an important measurement concept, psychometric methods should not be ignored

Reliability and validity, including functional analysis, should be investigated rather than assumed

None presented

Discuss clinimetrics and psychometrics as they relate to psychiatry

Support for the performance measurement stage because all measures need to be assessed using the same assessment methods

Fava et al 2004

Psychometric techniques should be used as part of clinimetrics but using psychometrics alone could lead to misleading effects in clinical research

Clinimetric methods differ from psychometric methods in the development of a scale

None presented

Discusses the use of clinimetrics in psychiatry

Support for difference in development of scales but suggest to integrate methods of other fields (in support of our informed zone)

Beaton et al 2005

Compared three item-reduction approaches to create QuickDASH and found that the concept-retention (judgment-based) approach produced a comparable, and slightly better measure than statistically-driven approaches

None presented | Empirical evaluation of the clinimetric and psychometric development of a shorter version of the DASH in the same population as Marx et al assessed the original DASH

The clinimetric measure had the strongest ranking in terms of measurement properties but very similar, often assumed to depend on solid psychometric foundations

Ribera et al 2006

Clinimetric and psychometric approaches to domain development compared for a myocardial infarction questionnaire and found that the clinimetrically scored version had better overall measurement properties and should be used until a better psychometrically scored version is developed

Methods should be integrated (i.e. if factor analytic techniques give ambiguous results in different datasets, clinimetrics may help)

Clinimetric and psychometric approaches to developing multi-item HRQOL instruments may be complementary because the first improves validity and responsiveness while the second improves reliability

Empirical evaluation of the clinimetric and psychometric development of measure domains (not item selection)

Support for the informed zone because the authors suggest that integrating the two approaches in development and evaluation of measures might be the best approach

Suggest that results of measurement properties differ based on development method but that the same properties should be assessed

Turner et al 2009

Comparing clinimetric and psychometric methods to assign weights to a pediatric gastroenterology index and found that different items were retained

Mathematical approach to weighting the index was superior to the judgemental approach because of feasibility (i.e. lab items omitted)

Measurement properties were equally good for both methods (even though psychometrics is often criticized for a lack of face validity and sensibility)

Results suggest that a Delphi technique may not be able to substitute large expensive prospective studies needed for weighting measures using psychometric methods (i.e. exclusion of laboratory items created a more feasible measure that would be quicker to complete)

Empirical evaluation of the clinimetric and psychometric weighting of measures

Suggest that a thoughtful combination of both mathematical and clinical methods should be used in development of both clinimetric and psychometric measures (in support of our informed zone)

Fava et al 2012

Suggest that homogeneity of components is not requested for clinimetrics as it may be for psychometrics

Suggest that weighting of individual items may differ as long as the measure is able to discriminate between different groups of subjects and to reflect changes in experimental settings (i.e. drug trials)

Insinuate that clinimetrics has a set of rules that differs from psychometrics in development stages

Suggest that clinimetric principles should guide the selection of methods for a specific assessment but do not describe principles that differ from psychometrics in the evaluation of properties

Suggest that global rating and involvement of the patient are unique features of clinimetric scales

Insinuate that clinimetrics has its own set of rules in evaluation stages

Domains included in ‘clinimetric’ latent traits described are traditionally more psychometric (i.e. psychosocial characteristics)

The term clinimetric used for all measures relevant to clinical practice

Measurement properties labeled clinimetric properties (ignoring psychometric contributions)

Support for the difference in development stages of our framework and the measurement performance stage since all the same assessment methods are recommended but renaming of the psychometric properties as clinimetric is not supportive of the framework

Table 2.2a: Studies using empirical methods to test differences between clinimetric and psychometric methods

Study: Juniper et al, 1997

Instrument: Asthma Quality of Life Questionnaire

Comparison of development methods:

Clinical impact versus factor analysis item reduction of 152 items in 150 adults with symptomatic asthma

Clinical impact used impact as the product of frequency and importance of items as rated by patients on a 5-point scale

Factor analysis based on Hyland and Marks excluding items with a frequency of <40%, then items with item-total correlations of less than 0.4 and then items with item inter-correlation >0.7. Finally, loadings less than 0.4 on the first factor of a principal component analysis were excluded

Results:

Different instruments resulted from using different item reduction approaches (with 20 items being in common between 32-item and 36-item final instruments)

Psychometric method discarded the highest impact emotional function and environmental items and included fatigue-related items instead

Measurement properties were not assessed or compared

Study: Marx et al, 1999

Instrument: Disabilities of Arm, Shoulder and Hand (DASH) Questionnaire

Comparison of development methods:

Clinimetric versus psychometric item selection/reduction of 70 items to a 30-item scale in 407 patients with upper extremity disorders

Clinimetric methods involved the highest mean additive scores of importance and severity using 5-point patient ratings

Psychometric methods involved equidiscriminatory item total correlation (EITC) which involves dividing scores into 25th, 50th and 75th percentiles and selecting items with the highest correlations with the overall score using the item-total correlations from each subgroup

Results:

Different instruments resulted from different item selection/reduction approaches (with 16 out of 30 items in common)

Cronbach’s alpha was high for both instruments and there was a high ICC between instruments

Final scores did not differ much after clinician input to finalize both instruments

Study: Beaton et al, 2005

Instrument: QuickDASH Questionnaire (obtained from the same data/population as Marx et al)

Comparison of development methods:

Compared 3 item reduction methods including concept-retention and 2 statistically-based approaches (EITC and item response theory) to shorten the 30-item DASH questionnaire into an 11-item QuickDASH in 407 patients with various upper-limb conditions

Measurement properties assessed in a longitudinal study of 200 patients with upper-limb disorders

Concept-retention is a judgmental approach geared toward retaining concepts and selecting items from each domain

Highest ranking items of EITC method are meant to detect disability across the full range of scores

Rasch modeling for the item response theory method selects items based on difficulty that are equally spaced and calibrated along scale length

Results:

Different methods resulted in retention of different items with EITC and Rasch retaining the most items in common (7) while concept-retention had 5 unique items

Two items are retained in common by all three methods

All three methods remained similarly reliable, valid and responsive as the original, longer DASH

Concept-retention QuickDASH was selected because it had the strongest ranking in terms of measurement properties

Study: Ribera et al, 2006

Instrument: McNew Quality of Life after Myocardial Infarction (QLMI) Questionnaire

Comparison of development methods:

A modified psychometrically derived version compared to the original clinimetrically derived version in terms of domain construction/scoring

The original clinimetric instrument was derived using clinician input and patient importance ratings and the 5 subdomains were selected on the basis of conceptual links and summarized into 2 domains (emotional and physical health)

The modified psychometric version was derived by removing 2 original items, adding 3 new items and developing a scoring method using exploratory factor analysis which resulted in 3 domains (physical, emotional and social)

Results:

Cronbach’s alpha and the standardized response mean were similar for the clinimetrically and psychometrically derived instruments, but validity was better for the clinimetrically derived instrument based on a priori hypothesized inter-correlations between instruments and external anchor correlations with the SF-36 domains

Study: Turner et al, 2009

Instrument: Pediatric Ulcerative Colitis Activity Index (PUCAI)

Comparison of development methods:

Differences in mathematical (psychometric) and judgmental (clinimetric) weighting of a gastroenterology index assessed

Originally weighted mathematically using multivariate regression modeling on 157 children

Independently, judgmental approach used a Delphi group of 36 experts to provide weights for the index which retained laboratory items excluded by the mathematical modeling

Results:

Weights of instruments developed using different approaches were similar

A difference in agreement was demonstrated by Bland and Altman plots due to the difference in laboratory item inclusion/exclusion

Construct validity and responsiveness results were equally good for both instruments but the mathematically derived version may be a more feasible index because laboratory items are excluded

Table 2.2b: Studies using empirical methods to test differences between clinimetric and psychometric methods

Study | Instrument | Did methods lead to a difference in content? | Did choice of measurement properties tested differ based on type of development method? | Supports our framework? | Were results of measurement properties superior for one method compared to the other?

Juniper et al, 1997 | Asthma Quality of Life Questionnaire | Yes | No | Yes | Not assessed

Marx et al, 1999 | Disabilities of Arm, Shoulder and Hand (DASH) Questionnaire | Yes | No | Yes | No

Beaton et al, 2005 | QuickDASH Questionnaire (obtained from the same data/population as Marx et al) | Yes | No | Yes | No

Ribera et al, 2006 | McNew Quality of Life after Myocardial Infarction (QLMI) Questionnaire | Yes | No | Yes | No

Turner et al, 2009 | Pediatric Ulcerative Colitis Activity Index (PUCAI) | Yes | No | Yes | No

2.3.5 Synthesis

We proposed a new conceptual framework (Figure 2.3) that links the two dominant schools of

measurement encountered in health research. Our scoping review highlighted that the science of

measurement is a specialty that includes several subspecialties which can overlap in some areas

while being distinct in others (Table 2.1). We found that the differences between clinimetrics

and psychometrics lie in the item development and precision/structure phases but not in the

measurement performance stage. Furthermore, we suggest that an overlap exists between the

different approaches of instrument development, with psychometrics and clinimetrics sitting at

opposing poles of the blended zone.

We identified three phases of instrument development. The first phase (item development and

scoring) includes the previously defined categories of item generation, reduction, definition of

response categories and scoring. The second phase (structure/precision) includes the verification

of the structure and content. The third phase (measurement performance) includes the evaluation

of the measurement properties of the instrument.

Our framework distinguishes between development stages of clinimetric and psychometric

measures. Specifically, clinimetric methods include target criterion indicators (e.g., diagnosis or

death within 24 hours) (Figure 2.3). They also include more clinical consensus regarding the

scope and structure of the measure in the structure and precision phase of measure development

(Figure 2.3). In contrast, the psychometric measures include untargeted scales that rely on

statistical approaches. Psychometric approaches favour techniques such as item-total

correlations, factor analysis and internal consistency over clinical consensus in decisions about

content.

We found several measures that were developed using a combination of both clinimetric and

psychometric principles. In some cases, consensus, clinical expert and patient opinion were

combined with psychometric-based statistical methods such as factor analysis or item-total

correlations.[99] In other cases, statistical methods and expert opinion were used to select items

that were subsequently subjected to clinical expert decisions on final item retention.[85] We

suggest that the development stages of these types of measures could be blended into a mutually

“informed zone” of overlap that merges the principles of both fields (Figure 2.3).

The third phase of our framework refers to testing the performance of the instrument. This phase

includes the assessment of reliability, validity and responsiveness of the instrument. The process

involves testing the performance of the numeric scores or classifications obtained from the

instrument. Both psychometrics and clinimetrics agree on the importance of this phase. Both

fields describe similar methods and analyses to study the reliability, construct validity,

responsiveness and interpretability of instruments. For example, both schools agree that

construct validity should be based on a priori hypotheses against which observed relationships

are compared and that similar analytical approaches can be used (e.g. correlations or known

groups) to test the relationships. However, differences in construct validity may exist based on the

structure of the instrument (e.g. one continuous measure compared to a multidimensional

instrument). Furthermore, it is important to note that statistical methods do not apply equally to

both schools. For example, internal consistency is useful in psychometrics to ensure that the

multiple items belong to the same construct. This is not the case for clinimetric tools because the

items do not need to be correlated. The measures developed in the informed zone are more

challenging. They may include multiple items of the same construct, but may also include the

full range of the clinical experiences that do not represent a single factor (e.g. Asthma Quality of

Life Questionnaire).
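The role of internal consistency described above can be illustrated with Cronbach's alpha. The following is a minimal sketch in Python, assuming complete data with one row of item scores per respondent; the function name `cronbach_alpha` and both toy data sets are hypothetical and are not taken from any of the reviewed studies.

```python
from statistics import variance

def cronbach_alpha(rows):
    """Cronbach's alpha for complete data: one row of item scores per respondent."""
    k = len(rows[0])                          # number of items
    items = list(zip(*rows))                  # transpose: one tuple per item
    sum_item_var = sum(variance(col) for col in items)
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum_item_var / total_var)

# Hypothetical scale whose three items rise and fall together (one construct)
homogeneous = [[1, 2, 1], [2, 3, 2], [3, 4, 3], [4, 5, 4]]

# Hypothetical index whose three items do not move together
heterogeneous = [[1, 4, 2], [4, 1, 3], [2, 3, 1], [3, 2, 4]]

alpha_homogeneous = cronbach_alpha(homogeneous)
alpha_heterogeneous = cronbach_alpha(heterogeneous)
```

In this sketch the homogeneous items yield an alpha close to 1, whereas the heterogeneous items yield a low alpha. Under the framework, a low alpha of this kind would count against a psychometric scale but not necessarily against a clinimetric index whose items are not expected to correlate.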

2.4 Discussion

We conducted a scoping literature review and proposed a new conceptual framework that unites

clinimetrics and psychometrics. We suggest that the ongoing debate between clinimetrics and

psychometrics is unnecessary and creates uncertainty as to the value of well-developed outcome

measures. We proposed to resolve this debate by identifying the unique strengths of each field in

order to specify situations when they should remain unique or be combined for use by both

fields.

Our framework is based on the literature that compares clinimetrics and psychometrics. Several

authors have demonstrated that the nature and structure of an instrument can be highly dependent

on the measurement school that influenced its development (Table 2.2b). Others point to the

differences between clinimetrics and psychometrics but also to the overlapping informed zone

that is common to proponents of both fields. Specifically, de Vet et al and Streiner et al

suggested that clinimetric indexes are more content driven and psychometric indexes are more

statistically driven although both approaches are used by both fields.[32,118]

In the current “Era of Assessment and Accountability”, measuring outcomes to identify

appropriate and cost-effective treatment approaches is a major focus in health care. Our search

demonstrates that the amount of literature published during this era is growing rapidly and that

an increasing level of attention focuses on the psychometric-clinimetric divide. In a recently

published textbook, de Vet et al stated that appropriate methods must be used by both fields

regardless of the clinimetric-psychometric distinction.[33] These authors refrained from

distinguishing measurement properties as clinimetric or psychometric. This suggests that clinical

measurement has already started moving toward integrating methods. Furthermore, this

international multi-disciplinary research group (largely clinimetric in focus) has reached

consensus on the taxonomy, terminology, and definitions used in the measurement field.[89]

They are currently developing critical appraisal tools to identify studies of high methodological

quality in systematic reviews on measurement properties which will help provide

recommendations for standardized methods in future publications (www.cosmin.nl/). Our

framework can serve to guide the development of such evaluation tools leading to

standardization by providing clarity on distinctions in instrument development that should be

graded differently.

Our study has strengths and limitations. We used scoping study methods to obtain information

relevant to the measurement debate. Using these established methods to perform our review is a

strength of our study. A scoping study differs from a systematic review because authors do not

assess the quality of included studies.[3,82] However, scoping studies usually address broader

topics in which different types of articles may be applicable in contrast to systematic reviews

which address specific questions through a relatively narrow range of quality-assessed

studies.[3,82] Our literature search highlighted that indexing of articles on the topic of

clinimetrics was limited. We used keywords instead of subject headings or other database-

specific indexing descriptors to search the literature on clinimetrics. Indexing descriptors

provide a controlled vocabulary for use in bibliographic records. This controlled vocabulary

leads to results with more specificity in literature searches. In our search, no indexing

descriptors relevant to clinimetrics were available (e.g. no relevant Medical Subject Heading

[MeSH] in Medline). This likely limited our ability to capture articles that were specifically on

clinimetrics. In fact, five of our included studies were obtained from article bibliographies as our

search could not target them. In addition, some journals were not consistently indexed in

databases (e.g. our database searches missed two of the included articles because only selected

issues of the journal were indexed). We propose that clinimetrics should be introduced as a

MeSH term in Medline to assist with more accurate indexing of articles on the topic. However,

Streiner pointed out that searching specifically for clinimetric articles might misdirect

researchers and clinicians and lead to ignorance of the vast contributions of psychometrics to

measurement.[118] Therefore, we propose that ‘measurement properties’ should be the indexing

descriptor for all measurement fields assessing instrument properties. This comprehensive term

would improve database searches on the topic by capturing all relevant literature from both fields.

Until indexing methods are changed, a

combination of all relevant terms needs to be used in comprehensive searches to avoid missing

articles. As apparent from one of our included studies (Table 2.1), some authors continue to use

terminology such as ‘clinimetric properties’ even in 2012 in a way that may overlook

contributions from the psychometric field.[43] We suggest that our framework should be used as

a guide in future instrument development and evaluation to avoid the continuation of this divide.

The development of our framework may have been limited by the available literature.

Specifically, our search strategy was constrained by the limited number of relevant

MeSH terms available. This may have led to missing relevant articles

since subject heading searches are designed to capture relevant literature more accurately.

However, we used variations of keywords to compensate for this limitation. Similar problems

were encountered by other authors in searching for literature on measurement properties.[124]

PubMed search filters to capture articles on instrument measurement properties have been

developed for this reason.[124] These filters should be used by researchers performing

systematic reviews and by clinicians scoping the literature for instruments with sound

measurement properties to use clinically. Our search aimed to capture articles on the clinimetric-

psychometric debate excluding articles discussing measurement properties of specific

instruments with no discussion of the difference between fields. Therefore, these filters were not

appropriate for our scoping review. Other schools of measurement, such as sociometrics and

anthropometrics, may also have unique features, but we have not reviewed them in this paper.

Literature on item response theory (IRT) was not included in our literature review or in the

proposed framework because it is based on a different theory. However, IRT likely belongs in

the informed overlap zone of instrument development. Specifically, IRT aims to develop

unidimensional instruments (similar to psychometric methods), but it provides different score

weights for items of different difficulty levels within an instrument. This suggests that a single

instrument can include multiple constructs rather than only highly inter-correlated attributes of a

single construct (similar to clinimetrics). However, this literature was not evaluated in the

current study and needs to be assessed in future research to determine its relationship and

contribution to the presented conceptual framework. Considering that the two measurement

fields (i.e. clinimetrics and psychometrics) demonstrate both uniqueness and an overlap, future

research should revisit other measurement fields. In doing so, we can determine if all

measurement fields have unique development stages and similar ways of evaluating the

performance of their outcome measures.

2.5 Conclusion

Our scoping review provides the supporting evidence for a new framework for measurement

development and evaluation. The framework proposes a shift in the early conceptual

foundations and development of a measure but finds a converging point in the measurement

performance of an instrument (e.g. reliability, validity, responsiveness). Our framework

highlights that many measures used in clinical medicine blend features of clinimetric (focus on

patient and expert input) and psychometric approaches (focusing on statistical analysis).

Assessing the quality of blended measures is challenging compared to assessing purely

clinimetric or psychometric measures. However, we found that all measures converge to the

same nomenclature and methods in the measurement performance stage. Therefore, labeling

measurement properties as clinimetric or psychometric may not be useful. Our new framework

will help scientists understand the measurement methodology by bringing together information

from both sides of a protracted debate.

Chapter 3:

Can Recovery from Whiplash-associated Disorders be Measured

Reliably in Patients with Acute Whiplash-Associated Disorders?

A Test-retest Reliability Study of the Whiplash Disability

Questionnaire

3.1 Introduction

More than 80% of individuals injured in traffic collisions suffer from whiplash-associated

disorders (WAD) and 50% of those will experience neck pain one year later.[15,19] Moreover,

whiplash is an important source of disability.[15] However, disability is difficult to define and

measure because it is highly contextualized and varies from person to person, place to place, and

from situation to situation.[5] Most current measures of WAD-related

disability lack comprehensiveness and have only been studied in sub-acute and chronic

patients.[63,98]

To be clinically useful, self-report outcome measures must be valid and reliable. This is

important for understanding the day-to-day variability in scores (reliability) and for quantifying true

changes in state.[119] The minimal detectable change (MDC) assesses this variability as the

minimal change an instrument can detect over the day-to-day variability of individuals with a

stable condition.[24,138] High reliability is necessary when interpreting results such as change

scores and individual responses to interventions to ensure that change detected by the instrument

is due to true change in state beyond the daily variability of individuals with a stable condition

(i.e. >MDC).[24,138]

The Whiplash Disability Questionnaire (WDQ) is a self-report questionnaire based on the International Classification of Functioning

(ICF) framework of disability and includes items from the Neck Disability Index (NDI) such as

pain intensity, personal care, lifting and work.[99,131] It also includes items important to WAD

patients (fatigue, participation in sports, depression, social activities and anger).[63,99] The

WDQ was designed as an evaluative tool to measure response to treatment and its psychometric

properties have been studied in chronic WAD patients.[78,99] Its test-retest reliability was

reported to be excellent (ICC[3,1]=0.93 over 1 month) and its MDC was adequate (MDC=15

points out of 130 with 90% confidence).[49,99,140] Development in a chronic population may

restrict its use in patients with acute injuries because their disability status is likely to change

more rapidly.[111]

The test-retest reliability and the MDC of the WDQ are unknown in patients with acute WAD.

The purpose of our study was to determine the short-term test-retest reliability of the WDQ and

its MDC in a cohort of patients with acute WAD. We also aimed to determine whether the WDQ

test-retest reliability varied with WAD grade and with a participant’s recall of their baseline

WDQ responses.

3.2 Methods

3.2.1 Participants

Eligible participants made an insurance claim for traffic injuries to AVIVA Canada between

February 2008 and August 2009. Participants were included if they: 1) were at least 18 years of

age; 2) resided or worked in the Greater Toronto area, Burlington, Cambridge or the Kitchener

area; 3) were diagnosed with WAD Grades I-III[113] by two trained study

coordinators/chiropractors; and 4) had WAD of less than 3 weeks in duration. Participants were

excluded if they: 1) were unable to provide written informed consent; 2) were unable to complete

the interview in English; and 3) had a history of neck surgery.

3.2.2 Procedure

Potential participants were recruited alongside the University Health Network (UHN) Whiplash

Intervention Trial but participation for this study was offered regardless of their eligibility for the

trial.[26] Informed consent was obtained from all participants prior to enrolment in the study.

The University Health Network and University of Toronto Research Ethics Boards approved the

study.

3.2.3 Data

Data collection: The WDQ, change in neck pain question and a memory question (i.e. “Do you

remember your answers to the questions asked three days ago?”) were administered in a

standardized in-person interview at baseline and again 3-5 days later. A 3-5 day period was

selected as a reasonable time frame within which participants would not remember their previous

responses and would be unlikely to experience a change.

Stability indicator: We determined whether a participant’s condition was stable (indicator of

stability) using the self-rated change in neck pain question: “How do you feel your neck pain has

changed since the injury?”.[90] This question included seven Likert-type response options ranging

from ‘Very much better’ to ‘Very much worse’. We chose this question because it has good test-

retest reliability and contains a time anchor.[90] We used two indicators of stability: 1) the ‘No

change’ response option and 2) an expanded definition of stability including response options

‘Slightly better’, ‘No change’, and ‘Slightly worse’. A similar expanded definition was used in

previous research.[72]
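The two stability definitions can be sketched as a simple classification rule. This is an illustration only: the function name `is_stable` and the `expanded` flag are hypothetical, and only the response labels quoted in the text are used (the full wording of all seven options is not reproduced here).

```python
# Response labels as quoted in the text; other options are treated as "not stable".
STRICT_STABLE = {"No change"}
EXPANDED_STABLE = {"Slightly better", "No change", "Slightly worse"}

def is_stable(response, expanded=False):
    """Return True when a change-in-neck-pain response counts as stable."""
    allowed = EXPANDED_STABLE if expanded else STRICT_STABLE
    return response in allowed

# A participant answering 'Slightly better' is stable only under the expanded definition.
strict_result = is_stable("Slightly better")
expanded_result = is_stable("Slightly better", expanded=True)
```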

Whiplash Disability Questionnaire (WDQ): The WDQ includes 13 items that measure the effect

of whiplash (Table 4).[99] Each item is scored from 0 (no impact) to 10 (greatest impact) on a

numerical scale. The responses are summed from 0 (no disability) to 130 (complete

disability).[99] As recommended by developers, missing item values were considered zeros in

the summation to obtain a total WDQ score.[99,140]
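The scoring rule described above can be sketched as follows. This is a minimal illustration, assuming that a missing item is represented by `None`; the function name `wdq_total` is hypothetical and the code is not the developers' scoring software.

```python
N_ITEMS = 13  # number of WDQ items stated in the text

def wdq_total(responses):
    """Sum 13 item responses (0 to 10 each); missing items (None) count as 0."""
    if len(responses) != N_ITEMS:
        raise ValueError("expected 13 item responses")
    total = 0
    for r in responses:
        if r is None:          # missing value treated as zero in the summation
            continue
        if not 0 <= r <= 10:
            raise ValueError("item responses must lie between 0 and 10")
        total += r
    return total

# Hypothetical participant who scored 5 on twelve items and skipped one item
example_total = wdq_total([5] * 12 + [None])
```

Because missing items are counted as zero, a skipped item lowers the total score; this is one reason the impact of missing data on reliability merits a sensitivity analysis.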

3.2.4 Sample Size

The sample size required to detect an Intra-class Correlation Coefficient (ICC) [model 2,1] of 0.9

using a lowest acceptable ICC value of 0.8 at a 0.05 level of significance is 46.[81,111,133]
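The text does not state which formula or statistical power produced this figure. As a sketch under stated assumptions, the approximation of Walter, Eliasziw and Donner for reliability studies (derived for a one-way ICC), with a one-sided significance level of 0.05, an assumed power of 80% and two administrations per participant, gives the same number. The function name `icc_sample_size` and its parameter names are hypothetical.

```python
from math import ceil, log
from statistics import NormalDist

def icc_sample_size(rho0, rho1, n_obs=2, alpha=0.05, power=0.80):
    """Approximate number of participants needed to show ICC = rho1 exceeds rho0."""
    z_alpha = NormalDist().inv_cdf(1 - alpha)   # one-sided critical value
    z_beta = NormalDist().inv_cdf(power)        # assumed power, not stated in the text
    theta0 = rho0 / (1 - rho0)
    theta1 = rho1 / (1 - rho1)
    c0 = (1 + n_obs * theta0) / (1 + n_obs * theta1)
    k = 1 + (2 * (z_alpha + z_beta) ** 2 * n_obs) / (log(c0) ** 2 * (n_obs - 1))
    return ceil(k)

required = icc_sample_size(0.8, 0.9)   # lowest acceptable 0.8, expected 0.9
```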

3.2.5 Analysis

3.2.5.1 Test-Retest Reliability

We used Shrout and Fleiss Model 2,1 for multiple-raters to calculate the ICC.[111,119,138]

Model 2,1 reflects the repeated structure of the data and allows the results to be interpreted for

more than our specific testing situation.[111,138] We used an ICC value of 0.8 or above as a

standard for a reasonable level of reliability for group level analyses and 0.9 as a reasonable level

for analysis at the individual level.[100] ICCs and 95% confidence intervals (CI) were

calculated for overall scores and for individual items. Missing values were excluded from

individual item ICC calculations. We used a memory question (i.e. ‘Do you remember your

answers to the questions [asked three days ago]?’) to determine if the ICCs were sensitive to

memory. Finally, we calculated the ICCs in participants with different WAD grades to

determine if test-retest reliability values were affected by WAD grade.
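
The ICC(2,1) can be computed from the mean squares of a two-way analysis of variance. The sketch below is a plain-Python illustration of that formula, not the SAS code used in this study; the function name and the small ratings table are hypothetical and the numbers are not study data.

```python
from typing import Sequence


def icc_2_1(scores: Sequence[Sequence[float]]) -> float:
    """ICC(2,1) for an n-by-k table (rows = participants, columns = occasions)."""
    n = len(scores)
    k = len(scores[0])
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Sums of squares for participants, occasions and residual error
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Hypothetical table: 6 participants measured on 4 occasions
example = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
example_icc = icc_2_1(example)
```

Because Model 2 treats the occasion (column) effect as random, systematic differences between administrations lower the coefficient, which is why the column mean square appears in the denominator.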

3.2.5.2 Minimal detectable change

The MDC statistic at 95% confidence was calculated using the standard error of measurement for

repeated measures.[24] The value of the MDC95 represents the change above which there is 95%

confidence that the change is greater than the day to day variability of a stable

participant.[24,138] The MDC is, therefore, determined in participants reporting no change. We

calculated the MDC using two methods. First, we calculated the standard error of measurement

(SEM) using the standard deviation of the mean baseline total WDQ and the ICC for the no

change group and then the MDC at 95% confidence. Second, we obtained the SEM directly

from a Repeated Measures Analysis of Variance (ANOVA) in the form of the root mean square

error (eliminating chances for potential errors in MDC calculations using standard deviation and

the ICC).
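
The first method can be sketched as below, assuming the usual definitions SEM = SD x sqrt(1 - ICC) and MDC = z x sqrt(2) x SEM. The function names are hypothetical. The inputs are the baseline SD (18.6, Table 3.1) and the ICC (0.83, Table 3.2) of the group reporting no change; because these inputs are rounded, the output (SEM about 7.7, MDC95 about 21.3, MDC90 about 17.8) can differ from the reported values in the first decimal place.

```python
import math


def sem_from_icc(sd_baseline: float, icc: float) -> float:
    """Standard error of measurement from the baseline SD and the ICC."""
    return sd_baseline * math.sqrt(1.0 - icc)


def mdc(sem: float, z: float = 1.96) -> float:
    """Minimal detectable change; sqrt(2) reflects error in two measurements."""
    return z * math.sqrt(2.0) * sem


sem = sem_from_icc(sd_baseline=18.6, icc=0.83)
mdc95 = mdc(sem)           # z = 1.96 for 95% confidence
mdc90 = mdc(sem, z=1.645)  # z = 1.645 for 90% confidence
```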

3.2.5.3 Sensitivity Analyses

We performed sensitivity analyses to determine the impact of missing data on the test-retest

reliability. We repeated the analyses after imputing the midpoint WDQ item value (5), the

highest item value (10), and the mean of other WDQ items for the individual.[65] A complete

case analysis was also performed by calculating an ICC (2,1) for the participants in the entire

sample and the participants in the group reporting no change who had no missing values.
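
The imputation rules used in the sensitivity analyses can be illustrated with a short sketch. The function name, the strategy labels and the use of `None` for a missing item are assumptions of the example; the analyses themselves were run in SAS.

```python
from statistics import mean
from typing import List, Optional, Sequence


def impute_items(items: Sequence[Optional[float]], strategy: str) -> List[float]:
    """Fill missing WDQ items (None) using one sensitivity-analysis rule."""
    observed = [x for x in items if x is not None]
    if strategy == "zero":           # primary analysis: missing counted as 0
        fill = 0.0
    elif strategy == "midpoint":     # midpoint of the 0-10 item scale
        fill = 5.0
    elif strategy == "highest":      # highest possible item value
        fill = 10.0
    elif strategy == "person_mean":  # mean of the participant's other items
        if not observed:
            raise ValueError("no observed items to average")
        fill = float(mean(observed))
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return [fill if x is None else float(x) for x in items]


# Hypothetical participant with one missing item
responses = [2, None, 8] + [4] * 10
totals = {s: sum(impute_items(responses, s))
          for s in ("zero", "midpoint", "highest", "person_mean")}
```

Recomputing the ICC on the totals produced by each rule shows how sensitive the reliability estimate is to the handling of missing items.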

All statistical analyses were performed using SAS software (SAS 9.1 for Windows, SAS

Institute Inc., Cary, NC, USA).

3.3 Results

WDQ data were obtained from all 66 participants at both administrations. At follow-up, 62

participants were asked whether they remembered their baseline responses, and the change in neck

pain question was answered by 54 participants. These questions were added to the follow-up after

data collection was initiated. Therefore, there were 4 missing values for the memory question and

12 missing values for the change in neck pain question because the questions were not administered.

3.3.1 Descriptive statistics

On average, participants were enrolled 5.6 days following their collision. The sociodemographic

characteristics of the sample are presented in Table 3.1. The mean age of the sample was 41.6

years and 71.2% were female. The mean WDQ score was 49.3 out of a possible 130 at baseline

and 46.5 at follow-up. Of the 54 participants who were asked the change in neck pain

question, 15 (27.8%) reported no change, 31 (57.4%) reported being slightly to very much better

and eight (14.8%) reported getting worse. The mean WDQ score for participants

reporting no change was 56.9 at baseline and 56.6 at follow-up.

3.3.2 Completeness of WDQ

At baseline, 24.2% (16/66) of the sample had one missing item and one participant (1.5%) had two missing

items. At follow-up, 16.7% (11/66) of the sample had one missing item. The most common

question with missing values was the effect of whiplash injury on sporting activities (10.6% of

the entire sample and 26.7% of the sample with no change) followed by the effect of whiplash

injury on driving or using public transportation (1.5% of the entire sample and 6.7% of the sample with no

change). Complete case analysis (with no missing values) included 46 participants in the entire

sample and 11 in the group reporting no change. Demographics of participants with missing

values did not differ compared to participants with complete data.


disorders.

Characteristic                              Entire Sample (N = 66)      No Change Subgroup (N = 15)
Female, no. (%)                             47 (71.2)                   8 (53.3)
Age, years
  Mean (SD); range                          41.6 (12.7); 19.6-73.5      43.3 (12.3); 19.6-66.5
Time since injury, days
  Mean (SD); median; range                  5.6 (4.4); 4.0; 0-19        4.8 (3.5); 4.0; 1-15
WAD grade, no. (%)
  I                                         19 (28.8)                   4 (26.7)
  II                                        47 (71.2)                   11 (73.3)
  III                                       0 (0)                       0 (0)
WDQ Total Score
  Mean (SD); median; range                  49.33 (28.8); 48.5; 2-116   56.93 (18.6); 53.0; 25-92
Highest level of education, no. (%)
  High school or less                       11 (16.7)                   5 (33.3)
  Post secondary or some university         18 (27.3)                   3 (20.0)
  Technical school graduate                 11 (16.7)                   2 (13.3)
  University graduate                       26 (39.4)                   5 (33.3)
Income, no. (%)
  $0-$49,999                                34 (51.5)                   5 (33.3)
  $50,000-$59,999                           11 (16.7)                   3 (20.0)
  $60,000-$79,999                           7 (10.6)                    2 (13.3)
  $80,000+                                  12 (18.2)                   4 (26.7)
  Did not respond                           2 (3.0)                     1 (6.7)
Lawyer Involvement in the Claim (%)         0 (0)                       0 (0)
Pain Intensity, Mean (SD)*
  Neck                                      5.74 (2.0)                  6.40 (1.3)
  Shoulder                                  4.32 (3.0)                  5.20 (2.7)
  Low Back                                  3.97 (3.4)                  2.67 (3.4)
  Headache                                  3.75 (3.2)                  3.73 (3.2)
  Arm                                       2.28 (2.8)                  2.53 (2.9)

Abbreviations: SD = standard deviation
*Numeric rating scale of 0-10 (0 = no pain and 10 = worst pain ever)

3.3.3 Test-retest reliability

The ICC(2,1) for the total WDQ score was 0.89 (95% CI 0.85-0.92) [Table 3.2]. Participants

who remembered their responses had similar test-retest reliability compared to participants who

did not remember their previous responses. The ICC was similar across WAD grades and for

those who reported a change or no change in their neck pain [Table 3.2]. Sensitivity analysis

demonstrated that the ICCs were not influenced by missing data [Table 3.3].

Table 3.2: Intra-class Correlation Coefficient for the Total Summary Score categorized by the

report of no recovery on the change in neck pain question and memory effects

Total Summary Score of the WDQ                                        n    ICC (2,1)  95% CI

All participants                                                      66   0.89       0.85-0.92
WAD grade                                                             66
  Grade I                                                             19   0.82       0.70-0.91
  Grade II                                                            47   0.88       0.83-0.92
Question about remembering the baseline responses answered            62
  Participants reporting that they remember their responses           36   0.89       0.84-0.94
  Participants reporting that they do not remember their responses    26   0.85       0.76-0.92
Change in neck pain question answered                                 54
  Participants reporting no change                                    15   0.83       0.69-0.93
  Participants reporting slight to no change                          32   0.85       0.77-0.91

3.3.4 Individual item test-retest reliability

The test-retest reliability for each of the 13 WDQ items ranged from ICC = 0.60 (95% CI 0.49-

0.71) for non-sporting leisure activity to ICC = 0.85 (95% CI 0.80-0.90) for the anxiety-related

question [Table 3.4]. The questions on whiplash-related pain (ICC=0.66; 95% CI 0.56-0.75) and

on the effect of the whiplash injury on sporting activities (ICC=0.67; 95% CI 0.56-0.78) also had

lower ICCs. The results for participants reporting ‘No change’ on the change in neck pain

question were similar for most items. However, some of these estimates differed. For example,

the reliability was lower for items related to work/home/study duties, tired/fatigued, non-sporting

leisure activity and anger [Table 3.4]. While this may be related to less precise estimates

(because of the small sample size, n=15), it is also possible that the reliability of these items is worse

for participants who report no change in neck pain.

Table 3.3: Sensitivity Analysis for the Intra-class Correlation Coefficient for the Total Summary

Score

Imputed value                 n    ICC (2,1)  95% CI

Entire sample
  0                           66   0.89       0.85-0.92
  5                           66   0.89       0.85-0.92
  10                          66   0.88       0.84-0.92
  Mean of individual          66   0.89       0.85-0.92
  Excluding missing values    46   0.89       0.85-0.92
Sample reporting no change
  0                           15   0.83       0.69-0.93
  5                           15   0.85       0.73-0.94
  10                          15   0.87       0.76-0.95
  Mean of individual          15   0.84       0.72-0.94
  Excluding missing values    11   0.89       0.77-0.96

Table 3.4: Intra-class Correlation Coefficient for individual items of the WDQ

                                      Entire Sample                   Subgroup reporting no change
Item  Individual item theme           n*   ICC (2,1)  95% CI          n*   ICC (2,1)  95% CI

1     Pain                            66   0.66       0.56-0.75       15   0.76       0.59-0.90
2     Personal care                   66   0.81       0.75-0.87       15   0.84       0.72-0.94
3     Work/home/study duties          65   0.75       0.67-0.82       15   0.54       0.27-0.79
4     Driving/Public transport use    64   0.78       0.70-0.84       14   0.71       0.50-0.88
5     Sleep                           66   0.78       0.71-0.84       15   0.74       0.55-0.89
6     Tired/Fatigued                  66   0.72       0.64-0.80       15   0.19       -0.11-0.56
7     Social activity                 65   0.84       0.78-0.89       15   0.77       0.60-0.90
8     Sporting activity               50   0.67       0.56-0.78       11   0.58       0.31-0.82
9     Non-sporting leisure activity   65   0.60       0.49-0.71       15   0.44       0.15-0.73
10    Sadness/depression              66   0.81       0.75-0.87       15   0.62       0.37-0.83
11    Anger                           66   0.74       0.66-0.82       15   0.35       0.05-0.68
12    Anxiety                         66   0.85       0.80-0.90       15   0.78       0.62-0.91
13    Concentration                   66   0.75       0.67-0.82       15   0.81       0.67-0.92

* Missing items were excluded in individual item ICC calculations

3.3.5 Minimal detectable change

For the 15 participants who reported no change in neck pain, the MDC95 was 21.4 (SD=14.9).

The parameters used in this calculation were the baseline WDQ SD (18.6) and the ICC for the no

change group (0.83), which produced an SEM of 7.7. The MDC and SEM obtained from the ANOVA were 21.9 and

7.9, respectively. These results suggest that the WDQ requires a change of 22/130 points before

one can be 95% confident that the change was beyond the daily variability of an individual with

a stable condition.

3.4 Discussion

In our study of participants with acute WAD, the WDQ demonstrated very good reliability and

moderate boundaries of error (i.e. one sixth of the scale). Our stratified analysis demonstrated

that the WDQ remained reliable regardless of WAD grade, memory effects or the report of no

change in neck pain.

Our results agree with the previous study of short- and medium-term test-retest reliability in

patients with chronic WAD.[140] An ICC(3,1) measured over one month was reported as 0.86

(n=52) and 0.93 in 24 participants who reported no change in their condition. In contrast to our

analysis, Willis et al. used a Model-3 ICC to compute reliability.[140] They justified using this

model because they administered the WDQ (one questionnaire) even though it was administered

at two time points. This may have led to an overestimation of the test-retest reliability measured

from the same patient at two time-points. We were interested in estimating the stability of the

WDQ over time; therefore, Model 2 (random effects of time interval) was indicated instead of

Model 3 (fixed effects).[138]

We found that some WDQ items demonstrated adequate reliability (i.e. anxiety, social activity,

sadness/depression and personal care) while others did not (i.e. pain, sporting activity, non-

sporting leisure activity). Most items (on their own) were less reliable than the total WDQ score.

The ICC statistic is known to be affected by the range of possible scores.[138] This

likely led to differences in ICCs because the range of possible scores is smaller for individual

items (out of 10) than for the overall score (out of 130). Furthermore, the sample size in the

group reporting no change was small and the ICC estimates for individual items therefore had

poorer precision in that subgroup. The follow-up period of 3-5 days was chosen to ensure WAD

stability in the entire sample. Specifically, minimal to no change was expected during this very

short interval of time in the acute phase of WAD. Therefore, the results of the entire sample were

a good demonstration of reliability in this population of stable participants with acute WAD.

Previous research in Australia has found MDC values similar to our results.[140] Willis et al

reported the 90% MDC of the WDQ over one month to be 15 points for 24 subjects in the

population with chronic symptoms.[140] Based on convention, we reported the MDC with the

95% confidence intervals.[138] Our MDC95 was larger (22 points) because we used wider

confidence intervals to compute the MDC and because there is more variability in symptoms in a

sample of acute WAD participants than in chronic ones. In our sample, the MDC90 was 18

points (n=15) which is very similar to the Australian figure.

Our study had strengths: 1) the recruitment of a sample with acute WAD injuries; 2) a perfect

follow-up rate (100%) and few missing values; 3) the use of the appropriate statistical model to

compute the ICCs (ICC model 2,1); and 4) the use of the conventional 95% confidence intervals

to estimate the MDC boundaries of error.[35,100,138] However, it also had limitations. One of

the limitations was missing data, specifically with the sporting activity item. This may be related

to the fact that we studied an acute sample of patients who may not have been able to return to or

attempt their sporting activity. We conducted a sensitivity analysis and found that the ICC

remained stable using complete case analysis and with the imputation of means and extreme

values. Therefore, we found no evidence that missing data biased our results. Another limitation

was a short period of time for repeat administration of the same questionnaires (3-5 days). This

may have affected the results if participants remembered their original answers; however, we

found that memory did not have a significant effect. Since the environment differed between

administrations (i.e. in-person interview at baseline and telephone interview at follow-up), this

can also be considered a limitation. However, both administrations were interviewer-

administered (with the interviewer verbally asking participants WDQ questions). Therefore, the

type of administration should be considered similar for both interviews and this additional

variability in conditions would likely have biased the reliability estimates toward the null. Because

the estimates were nevertheless satisfactory, the true reliability is likely better than what is

reported in this study and the measurement error may be overestimated. Finally, the change in

neck pain question was used to detect change over time in reliability testing. However, this

question asked if the participants’ neck pain had changed since the injury, not since the baseline

questionnaire was administered. It is possible that most of the symptom change occurred since

the injury, but before the baseline was administered. This would result in a portion of the sample

being falsely classified as changed when they have not changed between test administrations.

The ICC, therefore, was limited for the group reporting no change due to the inaccuracy of the

change in neck pain question, but adequate reliability was, nevertheless, demonstrated in both

groups.

Future research assessing WDQ responsiveness needs to consider our MDC results in their

calculations and in their assessment of the minimal clinically important change. Also, while

reliability is necessary in research because it allows accurate interpretation of results by

minimizing measurement error in statistical analysis, it is not sufficient on its own to establish

the usefulness of a measure.[24] Therefore, studying construct validity is the next step in

establishing the psychometric properties of this measure.

3.5 Conclusion

The results of this study suggest that the WDQ has very good test-retest reliability in individuals

with acute WAD. The reliability of the WDQ remains stable for participants reporting no change

which supports its use in research and in clinical practice. The WDQ had wide boundaries of

error based on the MDC value. Therefore, it may be limited in detecting true change in an

individual patient.

3.6 Acknowledgement

This study was funded by an industry grant from AVIVA Canada Incorporated to the University

Health Network for the UHN Whiplash Intervention Trial. Maja Stupar was funded by a Vanier

Canada Scholar Canadian Institutes of Health Research award. The authors declare no conflicts

of interest.

Chapter 4 :

Exploratory Factor Analysis, Validity and Responsiveness of the

Whiplash Disability Questionnaire in Adults with Acute Whiplash-

associated Disorders

4.1 Introduction

Although the burden of WAD-related disability is significant in society, there are few WAD-

specific measures with sound measurement properties that can be used to describe its impact on

individuals and society.[61,63] One of the available instruments is the Whiplash Disability

Questionnaire (WDQ), a 13-item disability measure that has been validated in adults with

chronic WAD and that can be used to provide a summative total score of whiplash-related

disability.[99,140] The development of the WDQ was based on the disability framework of the

International Classification of Functioning, Disability and Health (ICF) which includes the

constructs of impairment, activity limitations and participation restriction.[121,142] In addition

to the items included in previous instruments (e.g., neck pain, impact on personal care, lifting,

concentration) the WDQ includes items deemed important by individuals with WAD (e.g.

fatigue, participation in sports, depression, socializing with friends).[63,99] Therefore, the WDQ

includes items that cover multiple concepts, suggesting that it may contain distinct factors or

constructs. However, previous research in the chronic WAD population indicates that the WDQ

contains only one factor.[99] It is important to note that this determination was made using

principal component analysis, a method that is not designed to determine the factor structure of

measurement tools. To our knowledge, no one has used factor analysis to determine the factor

structure of the WDQ in the acute WAD population.

The measurement properties of the WDQ have been studied in a chronic and stable WAD

population.[99,140] Willis et al reported that the WDQ has excellent test-retest reliability (ICC

= 0.90 over 24 hours and ICC = 0.96 over one month) in chronic patients.[140] Similarly, we

found that the short-term test-retest reliability was adequate [i.e. ICC > 0.85] (n=66) in adults

with acute WAD (within 21 days of their accident). Face validity and responsiveness of the

WDQ were only studied in chronic WAD; its validity and responsiveness in acute WAD samples

are unknown.[99,140] Pinfold et al reported that according to a multidisciplinary medical

committee panel, the WDQ has reasonable face validity.[99]

Given the lack of standard methods to determine responsiveness, the responsiveness of the WDQ

has been studied using various methods. Specifically, Willis et al: 1) reported the responsiveness

statistic for the subgroup of participants reporting recovery or worsening on a global recovery

question; 2) reported the effect size and standardized response means for the overall chronic

study population; and 3) correlated overall WDQ change scores with the global recovery

question as the external anchor.[140] The responsiveness statistic was reported as 1.06 for

participants who improved over one month and -1.86 for those who got worse.[140] These

values suggest reasonable change over time in the appropriate direction for those reporting

change by demonstrating values that were not close to zero.[100,144] In contrast, effect size and

standardized response means calculated for the overall study population were almost zero, which

suggests that the overall study population had minimal to no change in symptoms over one

month.[140] Although effect size and standardized response means can be used similarly to the

responsiveness statistic to demonstrate change over time using an external anchor, Willis et al

only reported the lack of change in the overall study population using these statistics.

Spearman’s rank correlation between the WDQ change scores and patient-perceived recovery

(scale ranging from -5 to +5) was adequate (r = 0.67).[140] Although correlations of 0.7 are

often used to support responsiveness, it must be remembered that values below 0.7 are common

because of the measurement error associated with each instrument when measuring change over

time (i.e., compared to using actual scores of each instrument).[33]

The validity and responsiveness of the WDQ need to be assessed in the acute WAD population

to determine if the instrument can be used in patients with recent injuries. The purpose of this

study was to determine the factor structure, construct validity and responsiveness of the WDQ in

a sample of individuals with acute WAD.

4.2 Methods

4.2.1 Participants and Procedures

Eligible participants made an insurance claim for traffic injuries to AVIVA Canada between

February 2008 and August 2009. Participants were included if they: 1) were at least 18 years of

age; 2) resided or worked in the Greater Toronto area, Burlington, Cambridge or the Kitchener

area at the time of their motor vehicle collision; 3) were diagnosed with WAD Grades I-III[113]

by a trained study coordinator; and 4) had WAD of less than 3 weeks in duration. Participants

were excluded if they: 1) were unable to provide written informed consent; 2) were unable to

complete the interview in English; or 3) had a history of neck surgery.

Potential participants were recruited and assessed by two study coordinators who determined

their eligibility for participation in the University Health Network (UHN) Whiplash Intervention

Trial.[26] The eligibility assessment included three telephone screening questions (i.e., age, self-

report of neck pain intensity on an 11-point numerical rating scale (NRS) and time since injury).

Those individuals who met the telephone screening criteria were invited to a clinical assessment.

This assessment included a history, physical examination and imaging when necessary. Potential

participants were asked to participate in the current study regardless of their eligibility for the

trial. The eligibility for the trial was slightly different in that only participants with WAD Grade I

and II were included and those participants were randomized to different treatment groups. The

current cohort study also included WAD grade III and participants who were not randomized

into the trial. The Quebec Task Force defined WAD grade I clinical presentation as a neck

complaint of pain, stiffness or tenderness only and WAD grade II as a neck complaint with

musculoskeletal signs such as decreased range of motion and point tenderness.[113] WAD grade

III also includes neurological signs (e.g. decreased tendon reflexes, weakness, sensory deficits).

Informed consent was obtained from all participants prior to enrolment in the study. The UHN

and the University of Toronto Research Ethics Boards approved the study.

4.2.2 Data Collection

Participants completed an in-person, interviewer-administered questionnaire at baseline and an

in-person or telephone interview at the 6-week follow-up. At baseline, we collected data on

demographics, whiplash disability (WDQ), pain intensity (Numerical Rating Scale), neck

disability (Neck Disability Index [NDI] and the Neck Bournemouth Questionnaire), mental health

(CES-D), and general health (SF-36). Whiplash disability (WDQ) and a global recovery

question were also collected by interviewer-administered questionnaire at 6 weeks.

4.2.2.1 Whiplash Disability Questionnaire

The Whiplash Disability Questionnaire (WDQ) consists of 13 items that measure the effect of

whiplash on pain, personal care, work/home/study duties, driving/public transportation, sleep,

tiredness/fatigue, social activity, sporting activity, non-sporting leisure activity,

depression/sadness, anxiety, anger and concentration.[99] Each item response is rated on a

numerical scale from 0 (no impact) to 10 (greatest impact). The questionnaire responses are

summed for a maximum possible total of 130 points (designating complete disability) and a

minimum possible score of 0 (designating no disability).[99] The WDQ was conceptualized using

the ICF and the selection of its items was inspired by the Neck Disability Index (NDI) and a list

of WAD features deemed important to patients.[63,99,142] Items such as pain intensity,

personal care, lifting and work were obtained from the NDI, which is a 10-item measure of neck

disability.[131] In addition, the WDQ contains items found to be important to WAD patients

such as fatigue, participation in sports, depression, social activities and anger.[63,99] The WDQ

was designed as an evaluative tool to measure change in whiplash-related disability over

time.[78,99]

4.2.2.2 Numerical Pain Rating Scale

The numerical rating scale (NRS) is widely used in pain research.[60] Its psychometric

properties were found to be adequate in musculoskeletal injuries.[21,22,96] Specifically, it has

moderate test-retest reliability (ICC=0.76, 95% CI 0.51-0.87; period=2.5+/-0.96 days) and

adequate responsiveness (AUC=0.85, 95% CI 0.78-0.93) in patients with neck pain.[22] In

patients with acute injuries, the NRS performs well with scores that are highly correlated with

the pain intensity measured with the Visual Analogue Scale (VAS).[8] The NRS has good

discriminant qualities in patients with acute pain and better reliability than the VAS in patients

with trauma.[7]

4.2.2.3 Neck Disability Index

The Neck Disability Index (NDI) is a functional status, self-report questionnaire with 10 items.

Each item has six possible responses describing the level of severity. Level 0 in each item

denotes no pain/disability and level 5 describes maximal pain/disability. The scores of each item

are summed to a maximum total of 50 points. The summed score can be used as a measure of

disability based on the originally proposed scale (0-4 no disability; 5-14 mild; 15-24 moderate;

25-34 severe; 35 or more complete disability), or multiplied by two to obtain a percentage ranging from

no disability (0%) to complete disability (100%).[131] The NDI was designed to measure

change over time of neck pain-related disability in whiplash and persistent neck pain patients in

response to treatment.
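
The NDI scoring described above can be sketched as follows. The function name is hypothetical, and the top band is taken to start at 35 so that every score from 0 to 50 falls into exactly one band, which is an assumption about that boundary.

```python
from typing import Sequence, Tuple


def ndi_score(items: Sequence[int]) -> Tuple[int, int, str]:
    """Return (raw score out of 50, percentage, disability band) for the NDI."""
    if len(items) != 10:
        raise ValueError(f"expected 10 items, got {len(items)}")
    if any(not 0 <= x <= 5 for x in items):
        raise ValueError("each NDI item is scored from 0 to 5")
    raw = sum(items)
    percent = raw * 2  # 50 raw points correspond to 100%
    if raw <= 4:
        band = "no disability"
    elif raw <= 14:
        band = "mild"
    elif raw <= 24:
        band = "moderate"
    elif raw <= 34:
        band = "severe"
    else:  # 35 or more (assumed boundary)
        band = "complete"
    return raw, percent, band
```

For example, a participant answering level 2 on every item has a raw score of 20 (40%), which falls in the moderate band.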

The NDI has been widely used as a measure of pain and disability for neck pain and WAD in

clinical and research settings. The NDI is a reliable and responsive measure in patients with

acute neck pain similar to the WAD patients under investigation in this thesis.[22,56,116,139]

The construct validity of the NDI is reported to be adequate.[56,139] However, the face and

content validity of this region-specific measure may not capture all aspects of whiplash-related

disability. Although neck pain is the cardinal symptom of whiplash, it is certainly not the only

symptom experienced by patients with WAD. Furthermore, the ordinal scaling of this measure

at the item-level may not satisfy the interval-level scaling requirement needed for commonly

applied statistical methods of analysis used in clinical research.[129]

4.2.2.4 Neck Bournemouth Questionnaire

The Neck Bournemouth Questionnaire (NBQ) was developed in 2002 to measure neck-related

disability.[10] The authors report that it attempts to capture the affective and cognitive

dimensions of neck pain that are overlooked by other commonly used neck pain disability

measures (i.e. Northwick Neck Pain Questionnaire, NDI, Copenhagen Neck Pain (CNP) and

Neck Pain and Disability Scale).[10] The NBQ consists of seven items, each with response

options on a scale from 0 (no impairment) to 10 (complete impairment). The questions address

neck pain intensity, activity limitations (i.e., work, daily activities, recreational, social and

family), emotional symptoms (i.e., anxiety and depression), effect of work on neck pain, and

ability of the subject to control the neck pain.[10] Preliminary studies suggest that it has good

internal consistency (Cronbach’s alpha=0.9), moderate test-retest reliability (ICC=0.65), and

acceptable responsiveness (Cohen and Kazis effect sizes greater than one and better than NDI

and CNP) in individuals with nonspecific neck pain.[9,10,66] The NBQ has acceptable construct

validity when compared with the NDI (Pearson’s r of 0.51-0.71 for overall NBQ), SF-36 Health

Survey (moderate Pearson’s r correlations of -0.43 to -0.59 with individual items of the NBQ for

the physical health, mental health and social functioning SF-36 domains) and the Copenhagen

Neck Functional Disability Scale (Pearson’s r of 0.48-0.63).[10]

We chose the Neck Bournemouth Questionnaire instead of other possible neck disability

instruments because the Neck Bournemouth Questionnaire aims to capture affective and

cognitive symptoms related to neck pain. Moreover, other measures such as the Northwick Neck

Pain Questionnaire have items identical or very similar to the NDI items.[132] Since the WDQ

was developed from the NDI, the Neck Bournemouth Questionnaire was chosen to avoid

potentially artificial inflation of correlations between the constructs based on identical items.

4.2.2.5 CES-D

It is well documented that depressive symptomatology has a negative effect on recovery from

whiplash injuries.[14,107] The Center for Epidemiologic Studies Depression Scale (CES-D) is a widely used questionnaire with adequate

psychometric properties developed to measure the frequency of depressive symptoms.[102] It is

a 20-item scale developed by The National Institute of Mental Health with responses in a Likert

format ranging from 0 (lasting less than one day out of a week) to 3 (lasting all the time; 5-7 days

within a week). Total summative scores range from 0 to 60, with higher scores reflecting greater

levels of depressive symptoms and lower scores reflecting lower levels of symptoms. The CES-

D has 4 separate factors: depressive affect, somatic symptoms, positive affect, and interpersonal

relations.

4.2.2.6 SF-36 Health Survey

The SF-36 Health Survey is a 36-item questionnaire that assesses functional impairment and

symptoms due to medical health problems. It was developed for the Medical Outcomes Study,

and has been tested and validated extensively across a range of disorders.[88,103,115,137] It has

strong psychometric properties in some musculoskeletal and degenerative neck

conditions.[1,77,80,105] We used SF-36 Version 2.0 (SF-36v2) for acute injuries (1-week

recall).[136] The SF-36 is a generic measure that is intended to capture health-related quality of

life across a range of disorders. It consists of 8 domains (General Health, Physical Functioning,

Social Functioning, Role Physical, Role Emotional, Mental Health, Vitality, Bodily Pain) whose

item scores can be added up for a total SF-36 score, or assessed within the individual domains.

The SF-36 can be used to determine construct validity by comparing domains that are similar to

WDQ items.

4.2.2.7 Self-report Recovery

Self-reported recovery is widely used in clinical research as a global measure of improvement

following injury or disease.[93] We used a global recovery question that had seven response

options to the question: “How well do you feel you are recovering from your injuries?” The

seven response options were: 1. ‘completely better’; 2. ‘much improved’; 3. ‘slightly improved’;

4. ‘no change’; 5. ‘slightly worse’; 6. ‘much worse’; 7. ‘worse than ever’. This global recovery

question is reliable in acute WAD patients.[90] Self-reported recovery has been compared with

the SF-36 for minor musculoskeletal injuries and with neck pain NRS, Pain Disability Index and

CES-D in whiplash injuries.[18,93] Physical aspects of functional health status were more

strongly associated with self-reported recovery than were emotional or social aspects.

Incrementally poorer recovery ratings on the recovery question were also associated with greater

neck pain NRS, functional limitations, poorer physical health, depression and being off

work.[16]

4.2.3 Analysis

All statistical analyses were performed using SAS software (SAS 9.1 for Windows, SAS

Institute Inc., Cary, NC, USA).

4.2.3.1 Descriptive statistics

We examined the distributions of individual WDQ items and the total WDQ score to determine

the adequacy of the data for factor analysis and to determine which correlation statistic should be

used in construct validity. We also screened the data for missing values. Data distribution was

examined by visually inspecting plots, comparing the mean and median values as well as

examining the values of skewness and kurtosis for each item and the overall WDQ. Skewness

and kurtosis values between 0 and 1 were considered to demonstrate adequate normal

distribution of the data.
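The screening rule described above can be illustrated with a short Python sketch. It is not the SAS procedure used in this study: it uses simple moment-based (population) formulas, whereas SAS applies sample-size adjustments, and it assumes the 0 to 1 range refers to absolute values. The function names are hypothetical.

```python
import statistics


def skewness(values):
    """Moment-based skewness: m3 / m2**1.5 (0 for a symmetric distribution)."""
    n = len(values)
    mean = statistics.fmean(values)
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


def excess_kurtosis(values):
    """Moment-based excess kurtosis: m4 / m2**2 - 3 (0 for a normal curve)."""
    n = len(values)
    mean = statistics.fmean(values)
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m4 / m2 ** 2 - 3.0


def looks_normal(values, limit=1.0):
    """Assumed screening rule: |skewness| and |kurtosis| both within the limit."""
    return (abs(skewness(values)) <= limit
            and abs(excess_kurtosis(values)) <= limit)
```

An item with most responses at zero and a few high scores would return a large positive skewness and fail this screen, which is the pattern that would prompt the use of a rank-based correlation statistic.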

4.2.3.2 Factor Structure

We performed an exploratory factor analysis (EFA) of the WDQ. We considered inter-item

correlations to be adequate for factor analysis when correlations were between 0.3 and 0.7.[58]


Sampling adequacy of data for factor analysis was assessed by applying Bartlett’s Test of

Sphericity and the Kaiser-Meyer-Olkin (KMO) measure of sampling adequacy. We considered

the data adequate for factor analysis if Bartlett’s Test of Sphericity had a significant p-value

(<0.05) rejecting the null hypothesis of an identity matrix, and if the KMO value was above 0.8

demonstrating sampling adequacy.[38,51,75]
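As an illustration of the first of these checks, the following Python sketch computes the Bartlett's Test of Sphericity statistic from a correlation matrix using the standard formula, chi-square = -(n - 1 - (2p + 5)/6) ln|R| with p(p - 1)/2 degrees of freedom. It is a minimal sketch, not the SAS routine used here; the helper names are hypothetical, and the p-value and the KMO measure are deliberately left out.

```python
import math


def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


def bartlett_sphericity(corr, n_obs):
    """Return (chi_square, degrees_of_freedom) for a p x p correlation matrix."""
    p = len(corr)
    det = determinant(corr)
    if det <= 0:
        raise ValueError("correlation matrix must be positive definite")
    chi_square = -(n_obs - 1 - (2 * p + 5) / 6) * math.log(det)
    df = p * (p - 1) // 2
    return chi_square, df
```

An identity matrix (no inter-item correlation) gives a statistic of zero, so larger values indicate stronger departure from the null hypothesis stated above.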

Responses to the WDQ were subjected to an EFA using squared multiple correlations as prior

communality estimates. The maximum likelihood factor method was used to extract the factors

followed by a varimax (orthogonal) or promax (oblique) rotation. We used a combination of

several criteria to determine the number of factors to be retained: scree plot, proportion of

variance explained and clinical interpretability/meaningfulness.[58] The interpretability and

meaningfulness criterion meant that there had to be at least 3 items per factor that share a

conceptual meaning, that items loading on different factors measure different constructs, and that

the rotated factor pattern had a simple structure with no complex loadings.[58]

We used the likelihood ratio (chi-squared) test, root mean square residual (RMSR), the Akaike

information criteria (AIC), the Schwarz Bayesian criteria (SBC) and the Tucker and Lewis

Reliability coefficient (TLRC) to assess the goodness of fit of the models. We considered model

fit values adequate if they satisfied the arbitrary cut-points recommended for RMSR (< 0.08) and

[79,109] Finally, we considered models with lower AIC and SBC values and

chi-square model fit values with significant p-values (at p<0.05) to have better fit.[79]

An item was considered to load on a factor if its factor loading was ≥0.40 for that factor and

<0.40 for the other factors.[58] We performed sensitivity analyses on the factor solution by: 1)

excluding items with the most missing values; 2) excluding items resulting in complex factor

loading; 3) imputing ‘0’ for missing values.
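The loading rule can be sketched as follows. This is an illustrative Python sketch rather than the procedure run in SAS; the function name, the example loadings and the use of absolute values for negative loadings are assumptions.

```python
def assign_items(loadings, threshold=0.40):
    """Assign each item to a factor index, or None if it loads on no
    factor or on more than one factor (a complex loading).

    loadings: dict mapping item name -> list of loadings, one per factor.
    Absolute values are compared with the threshold (an assumption).
    """
    assignment = {}
    for item, values in loadings.items():
        salient = [i for i, v in enumerate(values) if abs(v) >= threshold]
        assignment[item] = salient[0] if len(salient) == 1 else None
    return assignment


# Hypothetical loadings for three items on two factors.
example = {
    "item_a": [0.61, 0.11],  # loads on factor 0 only
    "item_b": [0.45, 0.52],  # complex loading on both factors
    "item_c": [0.20, 0.10],  # loads on neither factor
}
result = assign_items(example)
```

Items returned as None are the ones that would be examined in the sensitivity analyses for complex loading.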

We measured internal consistency using Cronbach’s alpha for the overall scale and for each of

the factors. For items measuring one construct, Cronbach’s alpha was expected to be > 0.7.[120,122]

In cases where Cronbach’s alpha was > 0.9, item redundancy was considered,

and the number of items reduced accordingly.[120,122] For multi-construct instruments,

Cronbach’s alpha is not expected to satisfy these minimum values but should be assessed for

individual constructs.[117]
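For reference, Cronbach's alpha can be computed from item-level scores as k/(k - 1) multiplied by (1 - sum of item variances / variance of the total score). The Python sketch below is illustrative only; the function name and data layout are assumptions, and it requires complete data for every respondent.

```python
import statistics


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a set of items.

    item_scores: list of items, each a list of scores given in the same
    respondent order (no missing values).
    """
    k = len(item_scores)
    item_variances = [statistics.variance(item) for item in item_scores]
    totals = [sum(scores) for scores in zip(*item_scores)]
    total_variance = statistics.variance(totals)
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)
```

Applied to the items of a single factor, the result can be compared with the 0.7 and 0.9 values described above.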


4.2.3.3 Validity

We developed a priori hypotheses for the expected correlations between the WDQ total score,

items or domains and the various measures of constructs relevant to whiplash-related symptoms.

Our hypotheses included correlation estimations with constructs for pain intensity (Numerical

Rating Scale), neck disability (Neck Disability Index [NDI] and Bournemouth Questionnaire),

mental health (CES-D), and general health (SF-36). We hypothesized that the WDQ should

correlate with other measures of neck-related disability (i.e. NDI, Neck Bournemouth

Questionnaire). Therefore, we tested this potential association with a hypothesis that the total

WDQ score would correlate strongly (i.e. r=0.6-0.8) with the Neck Bournemouth Questionnaire

and the NRS for the neck area, and it would correlate very strongly with the NDI (i.e. r=0.7-0.9).

Our hypothesis of very strong correlations with the NDI was based on the fact that the WDQ was

developed using NDI items. We also hypothesized that the WDQ daily activities items and daily

activities domain would correlate strongly (i.e. 0.6-0.8) with the Neck Bournemouth

Questionnaire, SF-36 physical function and role physical, the NDI and the neck/head NRS. The

total WDQ score was hypothesized to correlate moderately (i.e. 0.4-0.6) with the SF-36 physical

function and role physical, and NRS for the upper and lower limbs. Similarly, emotional items

and the WDQ emotional domain were hypothesized to correlate moderately (i.e. 0.4-0.6) with

the SF-36 mental health and role emotional, as well as the NDI and NRS of the upper and lower

limb. Finally, the total WDQ score was hypothesized to correlate poorly with the CES-D (i.e.

0.2-0.4) and poorly-to-moderately (i.e. 0.3-0.5) with the SF-36 mental health and role emotional

along with trunk region NRS.

We calculated Pearson’s correlation coefficients for normally distributed data and

Spearman’s rank correlation coefficients for skewed data.
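The two statistics differ only in what is correlated: Pearson's coefficient uses the raw scores, while Spearman's coefficient is Pearson's coefficient applied to ranks (with tied values sharing their average rank). A minimal Python sketch, with hypothetical function names, is:

```python
import math


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """Ranks starting at 1; tied values receive the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

Because ranks are insensitive to the spacing between scores, a monotonic but non-linear relationship yields a Spearman coefficient of 1 while the Pearson coefficient stays below 1.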

4.2.3.4 Responsiveness

We computed responsiveness based on a priori hypotheses of WDQ score changes using the

report of recovery on the global recovery question, or a change of 3/10 or more points on the

neck pain NRS as external anchors. We hypothesized that participants with acute WAD would

demonstrate change over the six-week period using responsiveness statistics and associations

between recovery and WDQ scores based on a priori hypothesized correlations and the receiver

operator characteristics curve approach. In accordance with previous research, we considered a


change of 3 or more points on an 11-point NRS to be a moderate improvement that would

demonstrate responsiveness.[37] The report of recovery on the 6-week global recovery question

was defined by dichotomizing response options. Responses were dichotomized as recovered if

the options ‘completely better’ or ‘much improved’ were selected.[90] They were classified as

not recovered if any of the other response options were selected (including the ‘slightly

improved’, ‘no change’, ‘slightly worse’, ‘much worse’ and ‘worse than ever’ responses). Our

definition excluded the ‘slightly improved’ response option from the recovered group. This is in

accordance with previous research which suggested that excluding categories adjacent to the ‘no

change’ category may provide the most accurate definition of recovery.[74] By excluding

participants who may be very close to reporting no change, our dichotomization of recovery may

provide the least contaminated groups of patients who have recovered clearly differentiated from

the non-recovered group. Previous studies have used the same dichotomous definition of global

recovery.[18]
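The two external anchors defined above can be expressed as simple classification rules. The Python sketch below is illustrative; the numeric coding of the response options follows the order listed in section 4.2.2.7, and the function and constant names are hypothetical.

```python
# Response options of the global recovery question, coded 1 to 7.
RECOVERY_OPTIONS = {
    1: "completely better",
    2: "much improved",
    3: "slightly improved",
    4: "no change",
    5: "slightly worse",
    6: "much worse",
    7: "worse than ever",
}

# Only the two most favourable options count as recovered.
RECOVERED_CODES = {1, 2}


def is_recovered(response_code):
    """Dichotomize the 7-option recovery question."""
    if response_code not in RECOVERY_OPTIONS:
        raise ValueError(f"unknown response code: {response_code}")
    return response_code in RECOVERED_CODES


def nrs_improved(baseline, follow_up, minimum_change=3):
    """True if neck pain NRS (0-10) dropped by at least minimum_change points."""
    return (baseline - follow_up) >= minimum_change
```

Under these rules a participant answering 'slightly improved' is placed in the not recovered group, and a drop from 7 to 5 on the NRS does not count as improved.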

We assessed responsiveness by calculating the effect size, Guyatt’s responsiveness statistic and

the standardized response mean defining recovery using the recovery question.[23,54,68] Effect

size has been defined as the difference between mean scores at baseline and follow-up divided

by the standard deviation of baseline scores.[23,68] The standardized response mean uses a

similar ratio except that it uses a standard deviation of the change scores in the denominator.[68]

Guyatt’s responsiveness statistic, on the other hand, uses a different numerator. The numerator

is supposed to represent the smallest difference between baseline and follow-up scores that has a

meaningful benefit (such as a minimal clinically important difference (MCID)).[54,68] In the

absence of a proposed MCID, Guyatt et al suggested that a mean change can also be used.[54]

The Guyatt responsiveness statistic also differs from the other statistics because the denominator

is computed based on the error estimated from a sample of stable patients.[54,144]
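The three statistics share a numerator based on change and differ in the denominator. The Python sketch below illustrates the definitions given above; it is not the SAS code used in this study, the function names are hypothetical, and the error term for Guyatt's statistic is assumed to be the standard deviation of change scores in stable participants.

```python
import statistics


def effect_size(baseline, follow_up):
    """Mean change divided by the standard deviation of baseline scores."""
    change = statistics.fmean(baseline) - statistics.fmean(follow_up)
    return change / statistics.stdev(baseline)


def standardized_response_mean(baseline, follow_up):
    """Mean change divided by the standard deviation of the change scores.

    baseline and follow_up must list the same participants in the same order.
    """
    changes = [b - f for b, f in zip(baseline, follow_up)]
    return statistics.fmean(changes) / statistics.stdev(changes)


def guyatt_responsiveness(meaningful_change, stable_changes):
    """Meaningful change (an MCID, or a mean change when no MCID exists)
    divided by the variability of change among stable participants
    (assumed here to be the SD of their change scores)."""
    return meaningful_change / statistics.stdev(stable_changes)
```

Because the denominators differ, the three statistics can rank the same instruments differently, which is one reason to report all of them.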

Some authors have suggested that the relative size of change for responsiveness statistics can be

categorized into small, medium and large effect size.[144] However, these categorizations are

only helpful for head-to-head comparisons between tools. The magnitude of the statistic can

depend on several other factors such as intervening time or intervention. We considered

evidence of responsiveness to be demonstrated based on our a priori hypotheses without reliance

on the magnitude of the statistic. We presented all three statistics because different indices of

responsiveness may provide different results or a different responsiveness rank order.[144]


We also hypothesized that strong positive associations >0.8 would exist between a report of

recovery (i.e. using the recovery question or the NRS) and the summative total WDQ change

score between baseline and six weeks. We hypothesized that those participants reporting

recovery would demonstrate the highest WDQ change scores over the six-week period (i.e.

highest decrease in disability scores). For the two domains (daily activities and emotional),

moderate positive associations >0.5 were expected within the participants reporting recovery.

Finally, we demonstrated responsiveness using the receiver operator characteristic (ROC) curve

approach.[34] We reported AUCs for dichotomous variables because they provide a measure of

the instrument’s ability to discriminate between participants who improved and those who have

not, based on the external anchor.[33] Change thresholds in WDQ scores were plotted against

their ability to discriminate between participants who recovered and those who have not (using

change in neck pain NRS and the recovery question as external anchors). Sensitivity and

1-specificity were plotted for each change threshold and the area under the curve was calculated.

De Vet et al have suggested that an area under the curve (AUC) of at least 0.7 is suggestive of

good discriminative ability, and hence good responsiveness when a reasonable external anchor

(criterion indicator) is used.[33] While this threshold is reasonable for the overall WDQ and the

daily activities domain, we hypothesized that the emotional domain would demonstrate change over six

weeks that is less than the threshold proposed by De Vet et al (i.e. AUC = 0.6).
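The AUC has a direct interpretation: it is the probability that a randomly chosen recovered participant has a larger change score than a randomly chosen non-recovered participant, with ties counted as one half. The Python sketch below computes it by pairwise comparison, which is equivalent to the area under the empirical ROC curve; the function name and example values are hypothetical.

```python
def auc(change_recovered, change_not_recovered):
    """Area under the ROC curve for change scores against a binary anchor.

    Counts, over all recovered/non-recovered pairs, how often the recovered
    participant shows the larger change (ties count as half).
    """
    wins = 0.0
    for r in change_recovered:
        for n in change_not_recovered:
            if r > n:
                wins += 1.0
            elif r == n:
                wins += 0.5
    return wins / (len(change_recovered) * len(change_not_recovered))


# Hypothetical change scores (baseline minus follow-up).
example_auc = auc([5, 15], [10, 0])
```

A value of 0.5 corresponds to no discrimination between the groups, and a value of 1.0 to perfect separation.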

4.2.4 Sample Size

The sample size required to adequately perform an exploratory factor analysis is at least 5 to 10

participants for each variable being analyzed.[58,84] The WDQ

has 13 items, and we considered a sample size of 130 participants to be adequate for analysis.

4.3 Results

4.3.1 Sample characteristics

We enrolled 130 participants with acute WAD. Of those, 91 (70%) were female and mean age

was 42.1 (SD= 13.2) years [Table 4.1]. Participants were enrolled a mean of 6.5 days post-injury


(SD=4.9). Thirty-four (26%) participants had Grade I WAD, 95 (73%) had Grade II, and one

(0.8%) had Grade III. The majority of participants (87.8%) had education that was higher than

high school and almost half (47.7%) had an income of less than $50,000. Two participants

(1.5%) had lawyers involved in the claim. The mean WDQ score was 49.8 (SD=29.1) at

baseline (n=130). The NDI was completed by 125 participants with a mean score of 17.5

(SD=8.1) and 129 participants completed the Neck Bournemouth Questionnaire with a baseline

mean score of 30.6 (SD=16.4). The SF-36 was completed by all participants with mean

standardized scores of 62.4 (SD=26.0) for physical functioning and 68.7 (SD=21.5) for the

mental health domain. Finally, the CES-D was also completed by all participants with a mean

score of 20.2 (SD=7.0).

Seventy-eight percent (n=101) of participants responded to the six-week follow-up telephone

interview. However, one participant did not respond to the WDQ questionnaire. The mean

WDQ score was 32.0 (SD=31.2). At six weeks, 15 participants (14.9%) reported being

‘completely recovered’, 48 (47.5%) reported being ‘much improved’, 30 (29.7%) were ‘slightly

improved’, three (3%) reported ‘no change’ and five (5%) got worse.



disorders.

Characteristic                                                N     Baseline

Female, no. (%)                                               130   91 (70.0)
Age, years, mean (SD); range                                  130   42.1 (13.2); 19.6-81.6
Time since injury, days, mean (SD); median; range             130   6.5 (4.9); 5.00; 0-25
WAD grade                                                     130
    I, no. (%)                                                      34 (26.2)
    II, no. (%)                                                     95 (73.1)
    III, no. (%)                                                    1 (0.8)
WDQ Total Score, mean (SD); median; range                     130   49.8 (29.1); 46; 2-119
Neck Disability Index, mean (SD); median; range               125   17.5 (8.1); 16; 0-41
Neck Bournemouth Questionnaire, mean (SD); median; range      129   30.6 (16.4); 28; 2-65
SF-36 Physical Functioning, mean (SD); median; range          130   62.4 (26.0); 65; 0-100
SF-36 Mental Health, mean (SD); median; range                 130   68.7 (21.5); 75; 10-100
CES-D, mean (SD); median; range                               130   20.2 (7.0); 18; 8-42
Highest level of education, no. (%)                           130
    High school or less                                             17 (12.3)
    Post secondary or some university                               43 (33.1)
    Technical school graduate                                       21 (16.2)
    University graduate                                             50 (38.5)
Income, no. (%)                                               129
    $0-$49,999                                                      62 (47.7)
    $50,000-$59,999                                                 17 (13.1)
    $60,000-$79,999                                                 21 (16.2)
    $80,000+                                                        27 (20.8)
    Did not respond                                                 2 (1.5)
Lawyer Involvement in the Claim (%)                           130   2 (1.5)
Pain Intensity, mean (SD)*; median                            130
    Neck                                                            5.5 (2.0); 5.5
    Shoulder                                                        4.6 (2.8); 5.0
    Low Back                                                        3.9 (3.3); 4.0
    Headache                                                        3.7 (3.2); 4.0
    Arm                                                             2.2 (2.7); 0


4.3.2 Data completion

Responses to the WDQ demonstrated few missing values. The item with the highest number of

missing values was the sporting activities item (19 missing out of 130 at baseline and 14 out of

100 at follow up) [Table 4.2]. At baseline, 22 participants had one missing item and one

participant had 2 missing items. Distributions of individual items at baseline are provided in

Appendix 3. Table 4.2 and distributions in Appendix 3 demonstrate that the emotional items and

items related to social activities and driving or using public transportation may contribute less to

the overall WDQ score because they are more skewed toward no disability. The emotional

subscale and its items demonstrate a floor effect.

At follow up, 13 participants had one missing item and two participants had two missing items.

There were no participants with more than 2 missing items.

Table 4.2: Baseline means, medians and normality values of WDQ total score and individual

items

Variable                          N     Mean   Std Dev   Median   % min   % max   Kurtosis   Skewness

Pain                              130   5.7    2.1       6.0      0       2       -0.6       -0.2
Personal care                     130   3.0    2.9       2.0      33      0       -1.0       0.6
Work/home/study duties            129   4.8    3.1       5.0      12      8       -1.2       0.0
Driving/using public trans        127   3.8    3.0       3.0      18      2       -1.2       0.3
Sleep                             130   4.8    3.4       6.0      21      5       -1.4       -0.1
Fatigue/tiredness                 130   5.4    2.9       6.0      9       6       -0.8       -0.4
Social activities                 129   3.5    3.2       3.0      29      2       -1.2       0.5
Sporting activities               111   6.3    3.5       7.0      10      26      -1.1       -0.5
Non-sporting leisure activities   130   3.1    3.0       2.5      33      2       -0.9       0.6
Depression/sadness                130   2.3    3.0       1.0      48      1       -0.2       1.1
Anger                             130   2.2    3.0       0.0      54      2       0.1        1.1
Anxiety                           130   3.2    3.0       2.0      27      2       -1.0       0.6
Concentration                     130   2.9    3.0       2.0      36      2       -0.8       0.6
Summative WDQ score               130   49.8   29.1      46.0     0       0       -0.7       0.4
Daily activities subscale score   130   39.2   21.0      38.0     0       0       -0.8       0.2
Emotional subscale score          130   10.6   10.2      7.0      47.5    0       -0.2       0.9


4.3.3 Factor structure

The total WDQ score distribution satisfied the normal distribution assumption (skewness=0.4;

kurtosis=-0.7) (Figure 4.1). Most of the individual items were skewed toward less disability (i.e.

means and medians being mostly below 5/10) [Table 4.2]. Ninety-six percent of inter-item

correlations were above 0.3 and 97.4% were below 0.7. The Bartlett’s Test of Sphericity and

Kaiser-Meyer-Olkin measure supported that the inter-item correlations were adequate for factor

analysis with a chi-squared value of 116.3 (p<0.0001) and a KMO value of 0.92.

Figure 4.1: Total WDQ baseline distribution

EFA was performed on baseline data collected from 107 participants with complete WDQ scores

(i.e. no missing values on the correlation matrix). The proportion of variance explained, the

Scree plot and clinical interpretability suggested that two factors should be retained [Figure 4.2].

However, the model fit criteria suggested that a three-factor model was appropriate but the three-

factor model resulted in complex loading of the concentration item (WDQ13) [Table 4.3].

Therefore, the three-factor model failed to satisfy the clinical interpretability criteria that we set a


priori, which required a simple factor solution with no complex loading of items (i.e. no items

loading on more than one factor in the final solution).

[Scree Plot of Eigenvalues. Vertical axis: Eigenvalues (-2.5 to 20.0); horizontal axis: Number (0 to 13)]

Figure 4.2: Factor analysis scree plot

Using promax rotation, we determined that two factors should be retained with nine items

loading on the daily activities factor and four loading on the emotion factor [Table 4.4].

Specifically, depression/sadness, anger, anxiety and concentration items group together

conceptually and statistically to describe the emotional subscale of the WDQ [Table 4.4]. The

nine items loading on the other factor share a conceptual meaning of activities involved in daily

life along with pain and fatigue/tiredness, which can be considered closely associated with daily

life activities. Personal care, work/home/study duties, driving/using public transportation, sleep

and social, sporting and non-sporting leisure activities share the conceptual meaning of daily

activities. Statistically and theoretically, they group well with pain and fatigue/tiredness because

individuals with pain or fatigue often have activity limitations caused by pain/fatigue, but

may also get more pain/fatigue with activities. In a practical sense, it may often be difficult to

determine if pain/fatigue caused the activity limitation, or if the limitation is the cause of the

pain/fatigue. Statistically, these concepts group together with strong correlations to describe the

daily activities subscale of the WDQ [Table 4.4]. The correlation between the two

factors (daily activities, emotional) in the promax rotation was 0.63. This high correlation

suggests that emotions are associated with daily activity limitations in adults with acute WAD

and should not be ignored when measuring outcomes in research studies and in clinical practice.

Table 4.3: Model fit statistics for the models with different number of factors in the WDQ

# of Factors   LR χ²    df   P-value   RMSR    AIC     SBC      TLRC

One            189.4    65   <0.0001   0.082   70.4    -103.3   0.82
Two            109.2    53   <0.0001   0.053   10.3    -131.4   0.90
Three          66.3     42   0.0098    0.037   -12.9   -125.2   0.94

* LR=Likelihood Ratio; df=degrees of freedom; RMSR=root mean square residual; AIC=Akaike information criteria; SBC=Schwarz Bayesian criteria; TLRC=Tucker and Lewis Reliability coefficient

We performed sensitivity analyses on the factor solution. First, we excluded the sporting item

since it was the item with the most missing values. The two-factor solution remained stable with

this exclusion (n=125). Second, we excluded the concentration item because it resulted in a

complex 3-factor solution by loading on two factors. The two-factor solution remained stable on

removal of the concentration item (n=107). Finally, we imputed ‘0’ for missing values as

suggested by the WDQ developers, and the imputation did not alter the two-factor solution

(n=130).

The Cronbach’s alpha coefficient was 0.93 overall (n=107), 0.92 for the daily activities factor (n=107) and

0.88 for the emotion factor (n=130). Imputing zero for missing values resulted in adjusted

Cronbach alphas of 0.93, 0.91 and 0.88, respectively.


Table 4.4: Factor analysis of the WDQ: The 2-factor solution

                                             Factor Pattern                  Factor Structure
Variable*                                    Daily Activities   Emotional    Daily Activities   Emotional

1  Pain                                      .61                .11          .68                .49
2  Personal care                             .69                .03          .71                .47
3  Work/home/study duties                    .85                .00          .85                .54
4  Driving/using public transportation       .60                .23          .75                .61
5  Sleep                                     .59                .21          .72                .58
6  Fatigue/tiredness                         .46                .32          .67                .61
7  Social activities                         .69                .18          .80                .61
8  Sporting activities                       .83                -.17         .72                .35
9  Non-sporting leisure activities           .72                .12          .79                .57
10 Depression/sadness                        .20                .72          .65                .84
11 Anger                                     -.04               .67          .38                .64
12 Anxiety                                   -.09               .94          .50                .88
13 Concentration                             .28                .56          .63                .73

4.3.4 Validity

Our analysis suggested that the WDQ has adequate construct validity. The strong Pearson’s

correlations hypothesized a priori were observed between the WDQ and the NDI,

Bournemouth questionnaire, SF-36 physical function and numerical pain rating scales (for the

neck, shoulder, mid and low back pain) [Table 4.5]. Moderate correlations (as theorized) were

found for the CES-D and the SF-36 mental health domain [Table 4.5]. The NRS scores for

abdomen, hand, leg, foot and face pain intensity and the SF-36 mental health and role emotional

subscales scores were not normally distributed. Therefore, these correlations were computed

using Spearman’s rank correlations. All other distributions were normal and their correlations

were reported as Pearson’s correlations.


Table 4.5: Results of construct validation (n=130). A priori expected Pearson correlations

between the WDQ, its subdomains and constructs shown (E) followed by observed/achieved

results (A).

                                              WDQ        WDQ domain          WDQ domain
Measure                           E/A   N     Overall    daily activities    emotional

Neck Bournemouth Questionnaire    E           0.6-0.8    0.6-0.8             -
                                  A     129   0.89       0.86
SF-36 Physical function           E           0.4-0.6    0.6-0.8             -
                                  A     130   0.72       0.74
SF-36 Role Physical               E           0.4-0.6    0.6-0.8             -
                                  A     130   0.68       0.72
SF-36 Mental Health*              E           0.3-0.4    -                   0.4-0.6
                                  A     130   0.58                           0.70
SF-36 Role emotional*             E           0.3-0.4    -                   0.4-0.6
                                  A     130   0.57                           0.66
CES-D                             E           0.2-0.4    -                   0.6-0.8
                                  A     130   0.67                           0.73
NDI                               E           0.7-0.9    0.6-0.8             0.4-0.6
                                  A     125   0.80       0.82                0.60
NRS upper limb                    E           0.4-0.6    0.4-0.6             0.4-0.6
    NRS shoulder                  A     130   0.49       0.51                0.33
    NRS arm*                      A     130   0.42       0.37                0.40
    NRS hand*                     A     130   0.26       0.21                0.29
NRS lower limb                    E           0.4-0.6    0.4-0.6             0.4-0.6
    NRS leg*                      A     130   0.22       0.23                0.15
    NRS foot*                     A     130   0.24       0.26                0.21
NRS trunk                         E           0.3-0.5    0.3-0.5             0.3-0.5
    NRS midback                   A     130   0.48       0.46                0.42
    NRS low back pain             A     130   0.40       0.37                0.35
    NRS abdominal*                A     130   0.30       0.29                0.27
NRS neck/head                     E           0.6-0.8    0.6-0.8             0.6-0.8
    NRS neck                      A     130   0.64       0.65                0.48
    NRS head*                     A     130   0.44       0.45                0.36

All correlations significant at p<0.05 (except between leg pain and emotional domain (p=0.097))
E=expected; A=achieved
* Spearman rank correlations reported due to distributions skewed toward a value of zero


4.3.5 Responsiveness

The effect size, Guyatt’s responsiveness statistic and SRM for recovered participants (n=62)

demonstrated change in WDQ scores at 6 weeks for the overall WDQ and daily activities

subscale, but less for the emotional domain [Table 4.6]. As described in the methods, the exact

values of the responsiveness statistics should not be used to demonstrate severity of change since

arbitrary cutoff points are not useful. However, all responsiveness statistics demonstrate an

improvement. The mean change in scores also reflects that improvement in the overall WDQ

score and each of the subscales over 6 weeks [Table 4.6].

Table 4.6: Effect size, Guyatt’s responsiveness statistic (RS) and standardized response mean

(SRM) for participants reporting recovery on the global recovery question (N=62)

Variable                            N     Mean    SD      SD**    Effect Size   RS     SRM

Baseline total WDQ score            101   47.61   29.34
Follow-up total WDQ score           100   31.95   31.15
Baseline daily activities score     101   37.25   21.18
Follow-up daily activities score    100   23.99   22.83
Baseline emotional score            101   10.37   10.26
Follow-up emotional score           100   7.96    9.86
Change in total WDQ score           100   15.99   23.12   19.76   0.78          1.16   1.04
Change in daily activities score    100   13.48   17.9    21.13   0.92          0.92   1.14
Change in emotional score           100   2.51    7.52    6.08    0.34          0.57   0.49

SD = standard deviation; SD** = standard deviation of the group reporting no change

The total WDQ and daily activities subscale had moderate correlations with the global recovery

question at six weeks and with the change in NRS neck pain over the six-week follow-up [Table

4.7]. For the same comparisons, the emotional subscale demonstrated poor correlations. The

total WDQ score and its subscales did not reach a priori hypothesized correlations with the

7-category global recovery question or the change in 11-category NRS neck pain continuous

scores [Table 4.7]. A priori hypothesized correlations between the global recovery question at

six weeks, a change in pain intensity and the change in WDQ over six weeks were not achieved


for the dichotomized variables [Table 4.7]. However, AUCs for the total summative WDQ

change scores and the daily activities domain demonstrated a value over 0.7 for both the

recovery question and the change in NRS neck pain of three or more points, thus demonstrating

adequate responsiveness [Table 4.7]. The emotional domain AUC was below the hypothesized

0.6 for all comparisons.

Table 4.7: Spearman’s rank correlations and AUCs for responsiveness based on the a priori

hypotheses

                                                                                    WDQ               WDQ domain         WDQ domain
Comparison                                           Statistic   N                  Overall           daily activities   emotional

Continuous variable comparisons
  Global recovery
    7-category recovery question at 6 weeks          Corr        100                -0.41; p<0.0001   -0.41; p<0.0001    -0.19; p=0.057
  Change in neck pain
    Change in 11-point NRS for neck pain             Corr        100                0.54; p<0.0001    0.54; p<0.0001     0.27; p=0.007

Dichotomized external anchor comparisons
  Recovery status
    Completely recovered vs other responses          Corr#       100                0.29              0.30               0.17
    of the recovery question                         AUC         Recovered n=15     *                 0.68               0.50
    Completely better and Much Improved vs           Corr#       100                0.52              0.57               0.24
    other responses of recovery question             AUC         Recovered n=62     0.73              0.75               0.50
  Pain severity
    NRS neck pain change of 3 or more                Corr#       100                0.50              0.54               0.26
                                                     AUC         Recovered n=47     0.73              0.76               0.60

*unable to construct; # Biserial Correlation


4.4 Discussion

We found that in adults with acute WAD, the WDQ consists of two factors or subscales: daily

activities and emotional. Our sensitivity analyses demonstrated that the factor structure is stable

when we excluded the item with the most missing values (i.e. sporting activities) or the item that

had a complex loading (i.e. concentration item) in a three-factor model. Our two-factor model

was not affected by missing values. The total summative score and the two subscales of the

WDQ had good internal consistency. Our analysis also demonstrated that the WDQ had good

construct validity. Specifically, strong correlations were found between the total WDQ score and

the NDI, the Bournemouth questionnaire, the SF-36 physical function and the numerical pain

rating scales (for the neck, shoulder, mid and low back pain). Moderate correlations were also

demonstrated with the CES-D and the SF-36 mental health domain. Finally, the confirmation of our a

priori mini-theories supports that the overall summative score and the daily activities

subscale of the WDQ were responsive. However, responsiveness was not established for the

emotional subscale of the WDQ.

Previous work by Pinfold et al. found that the WDQ had only one factor in patients with chronic

WAD.[99] While it is possible that the WDQ has different factor structures in acute and chronic

populations, this finding may also be attributable to the analytical methods used to derive the

factors. Pinfold et al. used principal component analysis (PCA) instead of factor analysis.

Statistically, the two methods differ since the goal of PCA is to account for the total variance in

the sample.[40,50,58] PCA does not differentiate between the common and unique variance and

defines principal components as linear combinations of measured variables. In contrast, the

common factor model used in factor analysis assumes that total variance can be divided into

common and unique variance of each variable. It uses only common variance in determining the

number of factors to retain, and the measured variables are linear composites of estimated latent

factors in EFA. This can lead to different results because PCA uses total variance to estimate the

number of factors to extract from the solution, while EFA uses only the common variance of

each variable (i.e., a portion of the total variance that is shared among variables).[40,50,58]

Furthermore, PCA is mainly recommended as a data reduction technique; whereas, factor

analysis is appropriate for both data reduction and for determining the factor structure of an

instrument.[50,58] Therefore, we chose to use EFA instead of PCA.
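The distinction between total and common variance can be illustrated numerically. The following is a minimal sketch and not the analysis performed in this study: it assumes a hypothetical four-item correlation matrix and uses squared multiple correlations as initial communality estimates (one common choice among several). The function names are illustrative. PCA decomposes the correlation matrix with ones on the diagonal, whereas the common factor approach decomposes a reduced matrix whose diagonal holds only the estimated shared variance of each item.

```python
import numpy as np


def pca_eigenvalues(R):
    """Eigenvalues of the full correlation matrix (ones on the diagonal)."""
    return np.sort(np.linalg.eigvalsh(R))[::-1]


def common_factor_eigenvalues(R):
    """Eigenvalues of the reduced correlation matrix.

    The diagonal is replaced by squared multiple correlations (SMCs), an
    assumed initial estimate of each item's communality (shared variance).
    """
    smc = 1.0 - 1.0 / np.diag(np.linalg.inv(R))
    reduced = R.copy()
    np.fill_diagonal(reduced, smc)
    return np.sort(np.linalg.eigvalsh(reduced))[::-1]


# Hypothetical correlation matrix: items 1-2 and items 3-4 form two clusters.
# These values are illustrative only and are not WDQ data.
R = np.array([
    [1.0, 0.6, 0.2, 0.2],
    [0.6, 1.0, 0.2, 0.2],
    [0.2, 0.2, 1.0, 0.6],
    [0.2, 0.2, 0.6, 1.0],
])

pca = pca_eigenvalues(R)
efa = common_factor_eigenvalues(R)
print("Full matrix (total variance):    ", np.round(pca, 3))
print("Reduced matrix (common variance):", np.round(efa, 3))
```

In this hypothetical example the eigenvalues of the full matrix sum to the number of items, while those of the reduced matrix sum to the total estimated common variance. Retention rules applied to the two sets of eigenvalues therefore need not agree.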

Compared to our results, a higher internal consistency (alpha=0.96) for the total summative scale

was previously reported in chronic WAD. This finding may be attributable to the homogeneity

of the chronic population.[99] However, the authors did not report a high inter-item correlation

(>0.85), which would have suggested item redundancy.[99,120,122] Based on our results, the

WDQ has two subscales each with adequate internal consistency and no item redundancy. The

two-factor WDQ also demonstrated adequate overall internal consistency in this sample of acute

WAD.
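For readers less familiar with internal consistency, Cronbach's alpha can be computed directly from the item variances and the variance of the summed score. The sketch below is illustrative only: the responses are hypothetical and the function name is an assumption, not part of the analysis reported here.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for a list of item-score columns (one list per item)."""
    k = len(items)
    # Total score for each respondent across all items.
    totals = [sum(scores) for scores in zip(*items)]
    item_var_sum = sum(variance(col) for col in items)
    return k / (k - 1) * (1 - item_var_sum / variance(totals))


# Hypothetical 0-10 responses from five respondents on three items.
items = [
    [2, 4, 6, 8, 9],
    [1, 4, 5, 8, 10],
    [3, 3, 6, 7, 9],
]
alpha = cronbach_alpha(items)
print(round(alpha, 3))
```

Alpha approaches 1 as the items become more strongly inter-correlated, which is why very high values are usually checked against the inter-item correlations for possible redundancy.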

Our results demonstrated construct validity in participants with acute WAD by confirming our a

priori mini-theory correlations between the WDQ and other relevant constructs. Strong

correlations with other physical ability and pain constructs as well as moderate correlations with

the emotional constructs suggest that the WDQ is valid for capturing the full spectrum of

symptoms relevant to WAD. To our knowledge, the construct validity of the WDQ has not been

previously assessed. Face validity was previously found to be reasonable in chronic WAD by a

medical committee panel (i.e. practitioners involved in musculoskeletal rehabilitation, clinical

psychology and psychiatry).[99] No changes to the questionnaire were requested by this

multidisciplinary panel for the population with chronic WAD during the development of the

WDQ.[99]

Our responsiveness results suggest that the total WDQ and the daily activities subscale

demonstrate change over 6 weeks, but that the emotional domain is not responsive in acute

WAD. The lack of responsiveness in the emotional domain may be due to a lack of change in

emotion (i.e. depression, anxiety) over a 6-week period. With a floor effect of almost 50%, the

emotional subscale lacks the range of symptom severity needed to demonstrate change, since

almost half of this sample reported no emotional disability at onset. A previous study

reported an adequate correlation between the total WDQ score and a transition recovery question

over one month in chronic WAD (Spearman’s r=0.67).[140] Our lower correlations in acute

WAD were potentially due to the external anchor which had fewer response options. Our global

recovery question was a 7-response option Likert scale while Willis et al.’s question was an 11-

category Likert scale. Although a 7-response option Likert scale is often considered adequately

continuous for research purposes, an 11-category scale has more score points and therefore

behaves more like a continuous variable in statistical analysis. For dichotomous variables, we reported AUCs

because they provide a measure of the instrument’s ability to discriminate between participants

who improved and those who have not, based on the external anchor.[33] Our results

demonstrate satisfactory AUCs (AUC>0.7) for the total summative WDQ score and the daily

activities, but not the emotional subscale. To our knowledge, there are no other reports of AUCs

for WDQ responsiveness. Willis et al. also reported the effect size (ES=0.02) and the SRM

(SRM=0.05) demonstrating that there was minimal change in WAD disability in their overall

study sample over one month (n=52).[140] We assessed similar responsiveness statistics to

Willis et al. for those reporting improvement on our recovery question using the recovery status

definition (i.e. the top 2 recovery response options) in acute WAD (n=62). In contrast to Willis

et al., we found that all three responsiveness statistics demonstrated change over six weeks in

acute WAD for those reporting change. However, their assessment was performed in a chronic

population and no change was expected.[140] Responsiveness statistics were not assessed for

the participants reporting worsening since, as expected, deterioration was infrequent (n=8).

Caution should be used in demonstrating responsiveness on the WDQ because a

change of one sixth of the scale was needed to demonstrate significant change.
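The responsiveness statistics discussed above can be sketched as follows. This is a hedged illustration that assumes the conventional definitions: the AUC as the probability that a participant classified as improved on the external anchor has a larger change score than one classified as not improved, the effect size as mean change divided by the baseline standard deviation, and the SRM as mean change divided by the standard deviation of the change scores. The data and function names are hypothetical.

```python
from statistics import mean, stdev


def auc(improved, not_improved):
    """Probability that an improved participant has a larger change score
    than a non-improved participant (ties count as 0.5)."""
    wins = sum(
        1.0 if a > b else 0.5 if a == b else 0.0
        for a in improved
        for b in not_improved
    )
    return wins / (len(improved) * len(not_improved))


def effect_size(baseline, follow_up):
    """Mean change divided by the SD of the baseline scores."""
    change = [b - f for b, f in zip(baseline, follow_up)]
    return mean(change) / stdev(baseline)


def standardized_response_mean(baseline, follow_up):
    """Mean change divided by the SD of the change scores."""
    change = [b - f for b, f in zip(baseline, follow_up)]
    return mean(change) / stdev(change)


# Hypothetical change scores (baseline minus follow-up, so positive values
# mean less disability), grouped by the dichotomized external anchor.
improved = [30, 25, 18, 40, 12]
not_improved = [5, 18, -3, 10]
print(auc(improved, not_improved))
```

An AUC of 0.5 corresponds to no discrimination between the two anchor groups, which is the value observed for the emotional subscale on the recovery anchors in Table 4.7.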

Our study had several strengths. We had a large sample of participants with acute WAD

recruited within 21 days of their injury. We also had few missing data and a follow-up rate of

almost 80%. However, our results may be limited by several factors. Our EFA demonstrated two

factors. It is possible that a different set of decision criteria to determine a factor structure may

have led to different results (i.e., a minimum factor loading of 0.35 instead of 0.4 may result in

a different number of factors extracted from the factor solution). However, our decisions on

factor structure were based on a priori determined criteria, and we applied standards that are

commonly used and recommended in the field in terms of such criteria.[58] Missing values may

also lead to different factor structures. However, we performed sensitivity analyses and the factor

structure remained stable. There is no criterion for measuring WAD disability. Therefore, we

used a priori hypotheses to determine construct validity, which is the acceptable method of

assessing validity.[33] It is possible that other instruments measuring constructs relevant to

WAD disability (e.g., Hamilton Rating Scale for Depression, or the Northwick Neck Pain

Questionnaire) may provide different results; however, we used validated and common self-

report outcome measures for WAD that are representative of the relevant constructs. There is

controversy on how responsiveness should be assessed and determined.[33] We reported

correlations, AUCs and various responsiveness statistics using a priori hypotheses in order to

confirm changes in WAD disability over six weeks experienced by our participants (and

expressed through external anchors of change). Although the total WDQ and daily activities

subscale were responsive, the total WDQ still required a change of at least one sixth of the scale

over six weeks to demonstrate change beyond the daily variability of individuals reporting no

change. Therefore, clinicians and researchers should be cautious in using the total WDQ change

scores below 22 points.

We recommend that the daily activities subscale and the total summative WDQ be used.

However, the emotional subscale on its own is not responsive to change in participants with

acute WAD and should be used with caution until it undergoes more rigorous testing in a subset

of people expected to have more change in emotional functioning than our sample had. Future

research should address the lack of responsiveness of the emotional subscale and suggest

potential modifications that could improve the response to emotional factors in WAD disability.

4.5 Conclusion

Our results demonstrate that the WDQ has two factors (daily activities and emotional) and

adequate construct validity. While the WDQ demonstrated responsiveness to recovery over six

weeks, a change of almost one sixth of the WDQ (MDC=22 points) is required to demonstrate

change beyond the daily variability of individuals reporting no change. Furthermore, the

emotional subscale was not able to detect change during the first six weeks of the acute phase of

WAD. The WDQ can, therefore, be used in clinical settings to determine disability status and to

demonstrate change over time. However, only the overall score and the daily activities subscale

should be used to demonstrate change over time since the emotional subscale was not responsive

in this sample.

4.6 Acknowledgement

This study was funded by an industry grant from AVIVA Canada Incorporated to the University

Health Network for the UHN Whiplash Intervention Trial. Maja Stupar was funded by a Vanier

Canada Scholar Canadian Institutes of Health Research award. The authors declare no conflicts

of interest.

Chapter 5: Discussion

5.1 Context and summary of the thesis

The primary focus of this thesis was to investigate the measurement properties of a disability

outcome measure, the Whiplash Disability Questionnaire (WDQ), in a population of adults with

acute WAD. Specifically, I used classical test theory to determine the test-retest reliability,

construct validity, factor structure and responsiveness of the WDQ.[33] This is the first

evaluation of the WDQ in injured adults within 21 days of their motor vehicle collision. The

WDQ had only been validated in patients with chronic WAD; its applicability to adults with

acute WAD remained unknown. It is important to validate this instrument for use in acute

injuries because accurate measurement of outcomes early after a traffic collision and during

treatment may assist in preventing chronicity.

The WDQ is a comprehensive instrument compared to other outcome measures currently used in

clinical research and clinical practice to monitor individuals with WAD.[63] A comprehensive

instrument is preferable because it has better content validity than a less comprehensive

instrument.[125] Proper evaluation of disability throughout the course of whiplash injuries is

necessary to study treatment effectiveness and prevent chronic disability.

Different methodologies are available to assess the measurement properties of outcome

measures. A close look at these methodologies (and at their differences) led to the conceptual

paper (Chapter 2) that focused on resolving the debate between clinimetrics and psychometrics.

I found that differences were more prevalent for the development of outcome measures while the

evaluation involved similar methods for both clinimetrics and psychometrics.

When examining the measurement properties of an instrument, it is important to first establish

the instrument’s reliability and then its validity and responsiveness.[78] The WDQ was

developed with an evaluative purpose meaning that it was developed to measure the magnitude

of change in symptoms/disability over time.[78] I found that the WDQ is reliable in acute

whiplash injuries (Chapter 3) which led me to assess its factor structure, construct validity and

responsiveness. The exploratory factor analysis indicated that WDQ has two factors: daily

activities and emotional subscales (Chapter 4). Therefore, the construct validity and

responsiveness were assessed using both the total summative scale and the subscales. The WDQ

had reasonable construct validity for the total summative and subscale scores but the emotional

subscale did not demonstrate adequate responsiveness (Chapter 4). Finally, the reliability of the

subscales was evaluated a posteriori and found to be adequate. In the 66 participants

reporting minimal to no change in their whiplash symptoms over 3-5 days, the intra-class

correlation coefficient (ICC) for the daily activities subscale was 0.85 (95% CI 0.80-0.90) and

0.87 (95% CI 0.82-0.91) for the emotional subscale. The results of subscale reliability complete

the validation of the WDQ in acute WAD and link the results of Chapters 3 and 4.
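The computation behind a test-retest ICC can be sketched from the two-way ANOVA mean squares. The form of the ICC is not restated in this section, so the sketch below assumes ICC(2,1) (two-way random effects, absolute agreement, single measurement); the function name and the scores are hypothetical.

```python
def icc_2_1(scores):
    """ICC(2,1): two-way random effects, absolute agreement, single measurement.

    `scores` is a list of rows, one per participant, each row holding that
    participant's score on every occasion (two columns for test-retest).
    """
    n, k = len(scores), len(scores[0])
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Partition the total sum of squares into participants, occasions, error.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Hypothetical test and retest subscale scores for six stable participants.
test_retest = [[12, 14], [30, 27], [45, 48], [8, 10], [22, 20], [38, 41]]
print(round(icc_2_1(test_retest), 3))
```

Because this form penalizes systematic shifts between occasions as well as random error, it is a conservative choice for test-retest data in stable participants.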

5.2 Contribution of the research to the whiplash literature

The construct of disability is difficult to measure in WAD patients because it cannot be measured

with biological or patho-physiological measures. Instead, this construct is mostly measured

using self-reported outcome measures that can be perceived to contain bias due to the subjective

nature of the reporting of the outcome. In order to standardize the definition of disability, the

World Health Organization (WHO) developed the International Classification of Functioning,

Disability and Health (ICF).[121,142] According to this classification, disability ‘serves as an

umbrella term for impairments, activity limitations and participation restriction’ and this includes

environmental factors that can interact with these constructs.[142] Therefore, self-reported

outcome measures should be multi-faceted to capture all domains relevant to the definition of

disability as proposed by the ICF. However, in the WAD literature, the construct of recovery is

often equated to the absence of neck pain. This is problematic because WAD is a much broader

construct than neck pain.[59,134] Furthermore, WAD recovery lacks a standardized definition

which partly contributes to reporting of varying recovery rates in research. The WDQ was

recently identified as the most rigorously-developed whiplash-specific disability questionnaire

currently available to monitor patients because of its comprehensive scope.[134] Developers of

the WDQ combined the use of the ICF, expert opinion and a problem elicitation technique to

interview WAD patients to identify items for inclusion in the WDQ.[99,134]

My thesis provides validation of this promising tool. Its use may improve the measurement of

WAD disability, which will help standardize the measurement of disability in future research.

Standardizing the measurement of disability in acute WAD is important to document the impact

of whiplash injuries on patients’ activities. This measurement is also important to study the

clinical course of WAD and identify those who may be at risk of developing chronic disability.

While my thesis was underway, the COSMIN research group published a consensus document

on measurement terminology and proposed a list of evaluation criteria for the critical appraisal of

measurement studies.[125] Included in the criteria are items on adequate sample size, reporting

and handling of missing data that are relevant to all studies assessing measurement properties.

The criteria also included property specific items such as adequate follow-up period for test-

retest reliability and setting appropriate a priori hypotheses for construct validity and

responsiveness (the complete COSMIN criteria checklist is available online at

http://www.cosmin.nl/the-cosmin-checklist_8_5.html). It is important to note that the COSMIN

criteria have not yet been evaluated. However, the methods used in my thesis satisfy most of the

criteria suggested by this group of clinimetricians (Appendix 4).

Development of standardized criteria by the COSMIN group may lead to standardization of

methods and improvement of the quality of published research in the measurement field.

However, some of their criteria rating levels are arbitrary and may need to be modified during

the validation process. For example, the COSMIN group used a rule of thumb for sample size when evaluating

the quality of a study (Appendix 4).[125] This arbitrary criterion rating level does not take into

account that sample sizes vary according to the research questions and the parameters to be

estimated.[11,133]

While developing the evaluative criteria, the COSMIN research group also performed a literature

review of neck pain and disability measures, applied their criteria, and found that methodology

and reporting of literature on this topic could be improved in several areas.[126] Specifically,

Terwee et al reported that the most important methodological aspects that need improvement

included assessing unidimensionality in internal consistency analysis and using stable patients

and similar test conditions in studies on reliability and measurement error. Furthermore, they

suggested that more emphasis should be placed on the relevance and comprehensiveness of the

items in content validity studies and that construct validity and responsiveness studies should be

based on predefined hypotheses.[126] Therefore, this thesis has contributed methodologically

strong results on the measurement properties of the WDQ for acute injuries. Consequently, the

WDQ can be used in clinical and research settings as a more comprehensive condition-specific

outcome measure in whiplash injuries compared to the current commonly used neck-specific

measures. Although the responsiveness of the emotional subscale needs to be improved, the

WDQ has adequate measurement properties for use in clinical and research settings. Moreover,

its use will decrease the burden on patients and research participants by providing one short

instrument to assess a construct that would otherwise require multiple instruments.

5.3 Implications of the research

The results of this thesis have implications for several stakeholders including clinicians,

researchers, health policy makers, insurers and most importantly patients. The main implication

is that the WDQ, a comprehensive, condition-specific outcome measure, has measurement

properties that support its use in patients with acute WAD. This complements already published

information on its validity in chronic WAD. A questionnaire that is valid in different stages of a

disorder can be used to demonstrate significant changes in disability status over the full duration

of the condition and is not limited to just cases that have become chronic. In turn, the WDQ can

be helpful in studying how to prevent chronic WAD disability by assisting with the identification

of effective treatments.

Clinicians and researchers must be cautious about the responsiveness of the WDQ in acute WAD

patients, specifically when using the emotional subscale. We demonstrated responsiveness using

within-person and between-person scores for those reporting change using an external anchor.

This makes the responsiveness results relevant clinically (where within-person change is

assessed) and in research or for public health purposes (where between-person change is often

relevant such as differences between recovered and unrecovered groups). Furthermore, the

WDQ is reliable for evaluative purposes in research settings (i.e. adequate ICC for group-level

analysis) and in clinical settings (i.e. MDC of 22 points for changes in individual patients).

Aside from the limitations within the emotional subscale responsiveness, the WDQ is reliable,

valid and responsive for use in research and in clinical settings.

Validated instruments are necessary for accurate measurement of treatment effectiveness in

research and clinically. Without validated outcome measures, research results may be biased and

this bias can have a major impact on health policy as inaccurate research results can translate into

flawed health policies. Research results and health policy both influence the administration of

insurer coverage for injuries following motor vehicle collisions. Therefore, validated

instruments are the necessary building blocks for effective treatment of patients with WAD

because they help develop sound research results, effective health policy and effective insurance

policy.

5.4 Future research

The results of this thesis have led to several questions that should be answered in future research

studies. These include questions resulting from both the quantitative assessment of the WDQ

measurement properties and the conceptual measurement paper.

5.4.1 Content validity using qualitative methods

The item with the most missing values inquired about ‘sporting activities’. The relevance of this

item needs to be established in patients with acute WAD. For example, a participant with an

acute injury may not yet have had time to attempt a sporting activity, which may lead to missing values

because of the timing of administration. The meaning of the sporting activity item also needs to be

reconsidered. In Australia, where the WDQ was developed, the definition of sport participation

broadly includes participation in organized sport and non-organized sport plus physical

activities.[69] In Canada, sport is defined as an organized physical activity such as aerobics,

walking clubs or baseball.[12] Walking to work or bicycling for leisure would not be included

as a sporting activity. Survey studies have shown that 49% of Canadian adults over the age of 20

walk at least 30 minutes per day but only 34% of them report that they participate in sport.[12]

Therefore, a Canadian participant may answer that the sporting activity item is not applicable

simply because they have not attempted an organized physical activity instead of answering how

it affects their daily physical activity. While there may not have been time to attempt

participation in organized sports in acute WAD, participation in everyday physical activities

would likely be attempted. However, these participants may not answer the sporting activity

item based on the wording of the item. While this may differ in other countries or regions, we

suspect that this item had the most missing values in our study because information on daily

activities was not considered when answering the item.

The sporting activity item should be modified, removed or substituted and the modified WDQ tested

both qualitatively to determine the appropriateness of the item and quantitatively to determine if

the measurement properties continue to hold or improve with the modified WDQ compared to

the original.

5.4.2 Minimizing measurement error

The WDQ requires a minimal change of 22 points before we can consider real change occurring

beyond the normal variations of a stable individual with WAD. This is a change of

approximately one sixth of the total score. This minimal detectable change coincides with the

minimal clinically important change and change beyond normal variations can therefore be

considered important as well. However, future research should examine how to decrease the

measurement error of the WDQ so that smaller changes in WAD disability can be detected.

There may be two potential ways to reduce measurement error: 1. modify the WDQ, and 2.

modify WDQ scoring. One approach may be to examine the items of the WDQ

conceptually and determine if the sporting item (with the most missing values) and the concentration

item (with complex loading in the three-factor model) should be modified or substituted with

other items or if more life participation items should be added. Another method may be to

examine the scoring of the WDQ. The WDQ may need to have weighted scoring or the

individual item scoring may need to be modified if a shorter item scale is found to be more

sensitive.
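The link between measurement error and the 22-point threshold can be made explicit with a short worked example. This is a sketch under stated assumptions rather than a restatement of the thesis calculations: it assumes the common formulas SEM = SD × √(1 − ICC) and MDC95 = 1.96 × √2 × SEM, and a total score range of 0 to 130 (13 items scored 0 to 10), which is not stated in this section. The function names are illustrative.

```python
import math


def sem_from_icc(sd, icc):
    """Standard error of measurement from the sample SD and the ICC."""
    return sd * math.sqrt(1.0 - icc)


def mdc95(sem):
    """Minimal detectable change at the 95% confidence level."""
    return 1.96 * math.sqrt(2.0) * sem


def sem_from_mdc95(mdc):
    """Invert the MDC95 formula to recover the implied SEM."""
    return mdc / (1.96 * math.sqrt(2.0))


SCALE_MAX = 130  # assumption: 13 items, each scored 0-10

# SEM implied by an MDC of 22 points, and the MDC as a fraction of the scale.
implied_sem = sem_from_mdc95(22)
print(round(implied_sem, 2))
print(round(22 / SCALE_MAX, 3))
```

Under these assumptions, any modification that lowers the SEM (for example, through a higher ICC) lowers the MDC proportionally, which is the rationale for the two strategies proposed above.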

5.4.3 Longitudinal and structural construct validity

Our sample included 130 participants who were assessed within 21 days of their motor vehicle

collision. Although this sample size is adequate for the assessment of WDQ measurement

properties, a larger sample may provide more accurate answers. Specifically, confirmatory factor

analysis is needed to confirm the 2-factor structure of the WDQ. Our sample was too small to be

split into two smaller sub-samples to perform exploratory and confirmatory factor analysis.

However, the UHN Whiplash Intervention Trial (WIT) which had similar inclusion and

exclusion criteria to this thesis’ cohort study and which shares a significant proportion of

participants can be used to perform such an analysis.[26] The UHN WIT can provide WDQ

baseline data on 340 participants. This sample would be large enough to split in order to perform

both a CFA and another EFA to determine if issues encountered in our study such as

missing values for the sporting activities item or complex loading of the concentration item in

the factor analysis would be relevant in a larger population or if a larger population could

potentially yield a 3-factor structure that would reflect domains of the ICF more closely.

The larger sample size would also be useful for retesting the emotional subscale responsiveness

over 6 weeks. However, the emotional subscale may need to be refined with modification or

inclusion of more emotional items to improve responsiveness over time. Modifications to the

WDQ would have to be tested in a separate study since the UHN WIT will only have the

original, unmodified WDQ.

5.4.4 Predictive validity

While the WDQ was developed with an evaluative purpose, it may have predictive properties

that could be helpful clinically in preventing chronic symptoms. The predictive utility of the

WDQ should be evaluated for identifying patients at risk for developing chronic WAD in a

similar population to determine if specific items, subscales or the total WDQ score can predict

cases that may require modifications in management. If the WDQ (including its items or

subscales) has a good discriminant validity at predicting cases that recover, then the instrument

would be useful to identify patients at risk of developing chronic disability. Furthermore, the

WDQ should be tested for inclusion in a clinical prediction rule that includes other prognostic

factors such as demographic or clinical factors.

5.4.5 Direct comparison with other relevant instruments

The measurement properties of the WDQ should be directly compared to other outcome

measures used in WAD to determine if the WDQ outperforms other instruments. The sample

size should be large enough to do sub-analyses based on chronicity since measurement properties

can differ based on time of administration.

5.4.6 Applicability of the conceptual framework

We have developed a conceptual framework to assist in the use of instruments relevant to

measuring clinical outcomes. Our framework provides a basis for differences in development

and consistency in evaluation of measurement properties between clinimetrics and

psychometrics. Once the framework is applied, it will be tested in terms of its applicability to

the field of measurement. The application of the framework will help determine if it needs

further modifications.

Other measurement fields may follow a similar pattern of conceptual similarities and differences

with clinimetrics and psychometrics but, to our knowledge, this has not been studied. Future

research should determine if other measurement fields that could be relevant to clinical

measurement, such as biometrics, can fit into a similar framework or if the conceptual

differences may demonstrate a different development and evaluation framework pattern.

References

[1] Angst F, Aeschlimann A, Steiner W, Stucki G. Responsiveness of the WOMAC

osteoarthritis index as compared with the SF-36 in patients with osteoarthritis of the legs

undergoing a comprehensive rehabilitation intervention. Ann Rheum Dis 2001;60:834-840.

[2] Apgar V. A proposal for a new method of evaluation of the newborn infant. Curr Res Anesth

Analg 1953;32:260-267.

[3] Arksey H, O'Malley L. Scoping Studies: Towards a Methodological Framework.

International Journal of Social Research Methodology: Theory & Practice 2005;8:19-32.

[4] Armitage P, David HA. Advances in biometry: 50 years of the International Biometric

Society. New York: Wiley, 1996.

[5] Beaton DE, Tarasuk V, Katz JN, Wright JG, Bombardier C. "Are you better?" A qualitative

study of the meaning of recovery. Arthritis Rheum 2001;45:270-279.

[6] Beaton DE, Wright JG, Katz JN, Upper Extremity Collaborative Group. Development of the

QuickDASH: comparison of three item-reduction approaches. J Bone Joint Surg Am

2005;87:1038-1046.

[7] Berthier F, Potel G, Leconte P, Touze MD, Baron D. Comparative study of methods of

measuring acute pain intensity in an ED. Am J Emerg Med 1998;16:132-136.

[8] Bijur PE, Latimer CT, Gallagher EJ. Validation of a verbally administered numerical rating

scale of acute pain for use in the emergency department. Acad Emerg Med 2003;10:390-392.

[9] Bolton JE. Sensitivity and specificity of outcome measures in patients with neck pain:

detecting clinically significant improvement. Spine 2004;29:2410-2417.

[10] Bolton JE, Humphreys BK. The Bournemouth Questionnaire: a short-form comprehensive

outcome measure. II. Psychometric properties in neck pain patients. J Manipulative Physiol Ther

2002;25:141-148.

[11] Bonett DG. Sample size requirements for estimating intraclass correlations with desired

precision. Stat Med 2002;21:1331-1335.

[12] Canadian Fitness and Lifestyle Research Institute. Section A: Physical Activity and Sport

Participation Rates in Canada. Statistics Canada 2008;2002/2003 Canadian Community Health

Survey.

[13] Carroll LJ, Cassidy JD, Côté P. Frequency, timing, and course of depressive

symptomatology after whiplash. Spine 2006;31:E551-6.

[14] Carroll LJ, Cassidy JD, Côté P. The role of pain coping strategies in prognosis after

whiplash injury: passive coping predicts slowed recovery. Pain 2006;124:18-26.

[15] Carroll LJ, Holm LW, Hogg-Johnson S, Côté P, Cassidy JD, Haldeman S, Nordin M,

Hurwitz EL, Carragee EJ, van der Velde G, Peloso PM, Guzman J, Bone and Joint Decade 2000-

2010 Task Force on Neck Pain and Its Associated Disorders. Course and prognostic factors for

neck pain in whiplash-associated disorders (WAD): results of the Bone and Joint Decade 2000-

2010 Task Force on Neck Pain and Its Associated Disorders. Spine 2008;33:S83-92.

[16] Carroll LJ, Jones DC, Ozegovic D, Cassidy JD. How well are you recovering? The

association between a simple question about recovery and patient reports of pain intensity and

pain disability in whiplash-associated disorders. Disabil Rehabil 2012;34:45-52.

[17] Carstensen TB, Frostholm L, Oernboel E, Kongsted A, Kasch H, Jensen TS, Fink P. Post-

trauma ratings of pre-collision pain and psychological distress predict poor outcome following

acute whiplash trauma: a 12-month follow-up study. Pain 2008;139:248-259.

[18] Cassidy JD, Carroll LJ, Côté P, Frank J. Does multidisciplinary rehabilitation benefit

whiplash recovery?: results of a population-based incidence cohort study. Spine 2007;32:126-

131.

[19] Cassidy JD, Carroll LJ, Côté P, Lemstra M, Berglund A, Nygren A. Effect of eliminating

compensation for pain and suffering on the outcome of insurance claims for whiplash injury. N

Engl J Med 2000;342:1179-1186.

[20] Chappuis G, Soltermann B, CEA, AREDOC, CEREDOC. Number and cost of claims linked to

minor cervical trauma in Europe: results from the comparative study by CEA, AREDOC and

CEREDOC. Eur Spine J 2008;17:1350-1357.

[21] Childs JD, Piva SR, Fritz JM. Responsiveness of the numeric pain rating scale in patients

with low back pain. Spine 2005;30:1331-1334.

[22] Cleland JA, Childs JD, Whitman JM. Psychometric properties of the Neck Disability Index

and Numeric Pain Rating Scale in patients with mechanical neck pain. Arch Phys Med Rehabil

2008;89:69-74.

[23] Cohen J. Statistical power analysis for the behavioral sciences. Hillsdale, NJ: L. Erlbaum

Associates, 1988.

[24] Copay AG, Subach BR, Glassman SD, Polly DW Jr, Schuler TC. Understanding the

minimum clinically important difference: a review of concepts and methods. Spine J 2007;7:541-

546.

[25] Côté P, Cassidy JD. The epidemiology of neck pain: what we have learned from our

population-based studies. Journal of the Canadian Chiropractic Association 2003;47:284-290.

[26] Côté P, Cassidy JD, Carette S, Boyle E, Shearer HM, Stupar M, Ammendolia C, van der

Velde G, Hayden JA, Yang X, van Tulder M, Frank JW. Protocol of a randomized controlled

trial of the effectiveness of physician education and activation versus two rehabilitation programs

for the treatment of Whiplash-associated Disorders: The University Health Network Whiplash

Intervention Trial. Trials 2008;9:75.

[27] Côté P, Cassidy JD, Carroll LJ. The factors associated with neck pain and its related

disability in the Saskatchewan population. Spine 2000;25:1109-1117.

[28] Côté P, Cassidy JD, Carroll LJ, Frank JW, Bombardier C. A systematic review of the

prognosis of acute whiplash and a new conceptual framework to synthesize the literature. Spine

2001;26:E445-58.

[29] Côté P, Hogg-Johnson S, Cassidy JD, Carroll LJ, Frank JW, Bombardier C. Initial patterns

of clinical care and recovery from whiplash injuries: a population-based cohort study.

Arch Intern Med 2005;165:2257-2263.

[30] Côté P, Soklaridis S. Does early management of whiplash-associated disorders assist or

impede recovery? Spine 2011;36:S275-9.

[31] de Vet HCW, Terwee CB, Bouter LM. Clinimetrics and psychometrics: two sides of the

same coin. J Clin Epidemiol 2003;56:1146-1147.

[32] de Vet HCW, Terwee CB, Bouter LM. Current challenges in clinimetrics. J Clin Epidemiol

2003;56:1137-1141.

[33] de Vet HCW, Terwee CB, Mokkink LB, Knol DL. Measurement in Medicine: A Practical

Guide. New York, U.S.A.: Cambridge University Press, 2011.

[34] Deyo RA, Centor RM. Assessing the responsiveness of functional scales to clinical change:

An analogy to diagnostic test performance. J Chronic Dis 1986;39:897-906.

[35] Deyo RA, Diehr P, Patrick DL. Reproducibility and responsiveness of health status

measures. Statistics and strategies for evaluation. Control Clin Trials 1991;12:142S-158S.

[36] Dijkers MP. Psychometrics and clinimetrics in assessing environments. A comment

suggested by Mackenzie et al., 2002. J Allied Health 2003;32:38-43.

[37] Dworkin RH, Turk DC, Wyrwich KW, Beaton D, Cleeland CS, Farrar JT, Haythornthwaite

JA, Jensen MP, Kerns RD, Ader DN, Brandenburg N, Burke LB, Cella D, Chandler J, Cowan P,

Dimitrova R, Dionne R, Hertz S, Jadad AR, Katz NP, Kehlet H, Kramer LD, Manning DC,

McCormick C, McDermott MP, McQuay HJ, Patel S, Porter L, Quessy S, Rappaport BA,

Rauschkolb C, Revicki DA, Rothman M, Schmader KE, Stacey BR, Stauffer JW, von Stein T,

White RE, Witter J, Zavisic S. Interpreting the clinical importance of treatment outcomes in

chronic pain clinical trials: IMMPACT recommendations. Journal of Pain 2008;9:105-121.

[38] Dziuban CD, Shirkey EC. When is a correlation matrix appropriate for factor analysis?

Some decision rules. Psychol Bull 1974;81:358-361.

[39] Emmelkamp PM. The additional value of clinimetrics needs to be established rather than

assumed. Psychother Psychosom 2004;73:142-144.

[40] Fabrigar LR, Wegener DT, MacCallum RC, Strahan EJ. Evaluating the use of exploratory

factor analysis in psychological research. Psychol Methods 1999;4:272-299.

[41] Fava GA, Belaise C. A discussion on the role of clinimetrics and the misleading effects of

psychometric theory. J Clin Epidemiol 2005;58:753-756.

[42] Fava GA, Ruini C, Rafanelli C. Psychometric theory is an obstacle to the progress of

clinical research. Psychother Psychosom 2004;73:145-148.

[43] Fava GA, Tomba E, Sonino N. Clinimetrics: the science of clinical measurements. Int J

Clin Pract 2012;66:11-15.

[44] Fayers PM, Hand DJ. Factor analysis, causal indicators and quality of life. Quality of Life

Research: An International Journal of Quality of Life Aspects of Treatment, Care &

Rehabilitation 1997;6:139-150.

[45] Fayers PM, Hand DJ, Bjordal K, Groenvold M. Causal indicators in quality of life research.

Quality of Life Research: An International Journal of Quality of Life Aspects of Treatment, Care

& Rehabilitation 1997;6:393-406.

[46] Feinstein AR. T. Duckett Jones Memorial Lecture. The Jones criteria and the challenges of

clinimetrics. Circulation 1982;66:1-5.

[47] Feinstein AR. Clinimetrics. New Haven: Yale University Press, 1987.

[48] Feinstein AR. Multi-item "instruments" vs Virginia Apgar's principles of clinimetrics. Arch

Intern Med 1999;159:125-128.

[49] Ferrari R, Russell A, Kelly AJ. Assessing whiplash recovery--the Whiplash Disability

Questionnaire. Aust Fam Physician 2006;35:653-654.

[50] Floyd FJ, Widaman KF. Factor analysis in the development and refinement of clinical

assessment instruments. Psychol Assess 1995;7:286-299.

[51] Gorsuch RL. Using Bartlett's Significance Test to determine the number of factors to

extract. Educational and Psychological Measurement 1973;33:361-364.

[52] Gray JAM. Evidence-based healthcare. New York: Churchill Livingstone, 2001.

[53] Gross A, Forget M, St George K, Fraser MM, Graham N, Perry L, Burnie SJ, Goldsmith

CH, Haines T, Brunarski D. Patient education for neck pain. Cochrane Database Syst Rev

2012;3:005106.

[54] Guyatt G, Walter S, Norman G. Measuring change over time: assessing the usefulness of

evaluative instruments. J Chronic Dis 1987;40:171-178.

[55] Guyatt GH, Bombardier C, Tugwell PX. Measuring disease-specific quality of life in

clinical trials. CMAJ 1986;134:889-895.

[56] Hains F, Waalen J, Mior S. Psychometric properties of the neck disability index. Journal of

Manipulative & Physiological Therapeutics 1998;21:75-80.

[57] Hamilton M. The assessment of anxiety states by rating. Br J Med Psychol 1959;32:50-55.

[58] Hatcher L. A step-by-step approach to using the SAS system for factor analysis and

structural equation modeling. Cary, NC: SAS Institute, 1994.

[59] Hincapié CA, Cassidy JD, Côté P, Carroll LJ, Guzman J. Pain localization after traffic

collisions: analysis of a population-based inception cohort study. In: World

Congress on Neck Pain, 2008. p. 100.

[60] Hjermstad M, Fayers P, Haugen D, Caraceni A, Hanks G, Loge J, Fainsinger R, Aass N,

Kaasa S, European Palliative Care Research Collaborative (EPCRC). Studies comparing

Numerical Rating Scales, Verbal Rating Scales, and Visual Analogue Scales for assessment of

pain intensity in adults: a systematic literature review. J Pain Symptom Manage 2011;41:1073-

1093.

[61] Holm LW, Carroll LJ, Cassidy JD, Hogg-Johnson S, Côté P, Guzman J, Peloso P, Nordin

M, Hurwitz E, van der Velde G, Carragee E, Haldeman S, Bone and Joint Decade 2000-2010

Task Force on Neck Pain and Its Associated Disorders. The burden and determinants of neck

pain in whiplash-associated disorders after traffic collisions: results of the Bone and Joint

Decade 2000-2010 Task Force on Neck Pain and Its Associated Disorders. Spine 2008;33:S52-9.

[62] Holm LW, Carroll LJ, Cassidy JD, Skillgate E, Ahlbom A. Expectations for recovery

important in the prognosis of whiplash injuries. PLoS Med 2008;5:e105.

[63] Hoving JL, O'Leary EF, Niere KR, Green S, Buchbinder R. Validity of the neck disability

index, Northwick Park neck pain questionnaire, and problem elicitation technique for measuring

disability associated with whiplash-associated disorders. Pain 2003;102:273-281.

[64] Hudak P, Amadio PC, Bombardier C, Beaton DE, Cole D, Davis AM, Hawker GA, Katz

JN, Makela M, Marx RG, Punnett L, Wright JG. Development of an upper extremity outcome

measure: the DASH (Disabilities of the Arm, Shoulder, and Hand). Am J Ind Med 1996;29:602-

608.

[65] Huntington JL, Dueck A. Handling missing data. Curr Probl Cancer 2005;29:317-325.

[66] Hurst H, Bolton J. Assessing the clinical significance of change scores recorded on

subjective outcome measures. J Manipulative Physiol Ther 2004;27:26-35.

[67] Hurwitz EL, Carragee EJ, van der Velde G, Carroll LJ, Nordin M, Guzman J, Peloso PM,

Holm LW, Côté P, Hogg-Johnson S, Cassidy JD, Haldeman S, Bone and Joint Decade 2000-

2010 Task Force on Neck Pain and Its Associated Disorders. Treatment of neck pain:

noninvasive interventions: results of the Bone and Joint Decade 2000-2010 Task Force on Neck

Pain and Its Associated Disorders. Spine 2008;33:S123-52.

[68] Husted JA, Cook RJ, Farewell VT, Gladman DD. Methods for assessing responsiveness: a

critical review and recommendations. J Clin Epidemiol 2000;53:459-468.

[69] Ifedi F. Sport participation in Canada, 2005. [S.l.]: Culture, 2008.

[70] Jette DU, Jette AM. Physical therapy and health outcomes in patients with spinal

impairments [erratum appears in Phys Ther 1997 Jan;77(1):113]. Phys Ther

1996;76:930-941.

[71] Jull GA, Soderlund A, Stemper BD, Kenardy J, Gross AR, Cote P, Treleaven J, Bogduk N,

Sterling M, Curatolo M. Toward optimal early management after whiplash injury to lessen the

rate of transition to chronicity: discussion paper 5. Spine 2011;36:S335-42.

[72] Juniper EF, Guyatt GH, Feeny DH, Ferrie PJ, Griffith LE, Townsend M. Measuring quality

of life in children with asthma. Qual Life Res 1996;5:35-46.

[73] Juniper EF, Guyatt GH, Streiner DL, King DR. Clinical impact versus factor analysis for

quality of life questionnaire construction. J Clin Epidemiol 1997;50:233-238.

[74] Juniper EF, Guyatt GH, Willan A, Griffith LE. Determining a minimal important change in

a disease-specific Quality of Life Questionnaire. J Clin Epidemiol 1994;47:81-87.

[75] Kaiser HF. A revised measure of sampling adequacy for factor-analytic data matrices.

Educational and Psychological Measurement 1981;41:379-381.

[76] Katz JN, Liang MH. Classification criteria revisited. Arthritis Rheum 1991;34:1228-1230.

[77] King JT Jr, Roberts MS. Validity and reliability of the Short Form-36 in cervical

spondylotic myelopathy. J Neurosurg 2002;97:180-185.

[78] Kirshner B, Guyatt G. A methodological framework for assessing health indices. J Chronic

Dis 1985;38:27-36.

[79] Kline RB. Principles and practice of structural equation modeling. New York: Guilford

Press, 2005.

[80] Kosinski M, Keller SD, Hatoum HT, Kong SX, Ware JE. The SF-36 Health Survey as a

generic outcome measure in clinical trials of patients with osteoarthritis and rheumatoid arthritis:

tests of data quality, scaling assumptions and score reliability. Med Care 1999;37:MS10-22.

[81] Kraemer HC, Korner AF. Statistical alternatives in assessing reliability, consistency, and

individual differences for quantitative measures: Application to behavioral measures of neonates.

Psychol Bull 1976;83:914-921.

[82] Levac D, Colquhoun H, O'Brien KK. Scoping studies: advancing the methodology.

Implement Sci 2010;5:69.

[83] Lohman TG, Roche AF, Martorell R. Anthropometric standardization reference manual.

Champaign, IL: Human Kinetics Books, 1988.

[84] MacCallum RC, Widaman KF, Zhang S, Hong S. Sample Size in Factor Analysis. Psychol

Methods 1999;4:84-99.

[85] Marx RG, Bombardier C, Hogg-Johnson S, Wright JG. Clinimetric and psychometric

strategies for development of a health measurement scale. J Clin Epidemiol 1999;52:105-111.

[86] Mason JH, Anderson JJ, Meenan RF, Haralson KM, Lewis-Stevens D, Kaine JL. The rapid

assessment of disease activity in rheumatology (radar) questionnaire. Validity and sensitivity to

change of a patient self-report measure of joint count and clinical status. Arthritis Rheum

1992;35:156-162.

[87] McCaskey M, Ettlin T, Schuster C. German version of the whiplash disability

questionnaire: reproducibility and responsiveness. Health Qual Life Outcomes 2013;11:36.

[88] McHorney CA, Ware JE Jr, Lu JF, Sherbourne CD. The MOS 36-item Short-Form Health

Survey (SF-36): III. Tests of data quality, scaling assumptions, and reliability across diverse

patient groups. Med Care 1994;32:40-66.

[89] Mokkink LB, Terwee CB, Patrick DL, Alonso J, Stratford PW, Knol DL, Bouter LM, de

Vet HCW. The COSMIN study reached international consensus on taxonomy, terminology, and

definitions of measurement properties for health-related patient-reported outcomes. J Clin

Epidemiol 2010;63:737-745.

[90] Ngo T, Stupar M, Côté P, Boyle E, Shearer H. A study of the test-retest reliability of the

self-perceived general recovery and self-perceived change in neck pain questions in patients with

recent whiplash-associated disorders. Eur Spine J 2010;19:957-962.

[91] Nierenberg AA, Sonino N. From clinical observations to clinimetrics: a tribute to Alvan R.

Feinstein, MD. Psychother Psychosom 2004;73:131-133.

[92] Nunnally JC, Bernstein IH. Psychometric theory. New York: McGraw-Hill, Inc., 1994.

[93] Ottosson C, Pettersson H, Johansson S-E, Nyren O, Ponzer S. Recovery after minor traffic

injuries: A randomized controlled trial. PLoS Clinical Trials 2007;2:e14.

[94] Ozegovic D, Carroll LJ, Cassidy JD. Factors associated with recovery expectations

following vehicle collision: a population-based study. J Rehabil Med 2010;42:66-73.

[95] Ozegovic D, Carroll LJ, Cassidy JD. Does expecting mean achieving? The association

between expecting to return to work and recovery in whiplash associated disorders: a population-

based prospective cohort study. Eur Spine J 2009;18:893-899.

[96] Pengel LH, Refshauge KM, Maher CG. Responsiveness of pain, disability, and physical

impairment outcomes in patients with low back pain. Spine 2004;29:879-883.

[97] Phillips LA, Carroll LJ, Cassidy JD, Cote P. Whiplash-associated disorders: who gets

depressed? Who stays depressed? Eur Spine J 2010;19:945-956.

[98] Pietrobon R, Coeytaux R, Carey T, Richardson W, DeVellis R. Standard Scales for

Measurement of Functional Outcome for Cervical Pain or Dysfunction: A Systematic Review.

Spine 2002;27:515-522.

[99] Pinfold M, Niere KR, O'Leary EF, Hoving JL, Green S, Buchbinder R. Validity and

internal consistency of a whiplash-specific disability measure. Spine 2004;29:263-268.

[100] Portney LG. Foundations of clinical research: applications to practice. Upper Saddle

River, N.J.: Prentice Hall Health, 2000.

[101] Quinlan KP, Annest JL, Myers B, Ryan G, Hill H. Neck strains and sprains among motor

vehicle occupants-United States, 2000. Accident Analysis & Prevention 2004;36:21-27.

[102] Radloff LS. The CES-D Scale: A self-report depression scale for research in the general

population. Applied Psychological Measurement 1977;1:385-401.

[103] Rebbeck TJ, Refshauge KM, Maher CG, Stewart M. Evaluation of the core outcome

measure in whiplash. Spine 2007;32:696-702.

[104] Relman AS. Assessment and accountability: the third revolution in medical care. N Engl J

Med 1988;319:1220-1222.

[105] Revicki DA, Rentz AM, Luo MP, Wong RL. Psychometric characteristics of the short

form 36 health survey and functional assessment of chronic illness Therapy-Fatigue subscale for

patients with ankylosing spondylitis. Health Qual Life Outcomes 2011;9:36.

[106] Ribera A, Permanyer-Miralda G, Alonso J, Cascant P, Soriano N, Brotons C. Is

psychometric scoring of the McNew Quality of Life after Myocardial Infarction questionnaire

superior to the clinimetric scoring? A comparison of the two approaches. Qual Life Res

2006;15:357-365.

[107] Richter M, Ferrari R, Otte D, Kuensebeck HW, Blauth M, Krettek C. Correlation of

clinical findings, collision parameters, and psychological factors in the outcome of whiplash

associated disorders. J Neurol Neurosurg Psychiatry 2004;75:758-764.

[108] Salaffi F, Carotti M, Grassi W. Health-related quality of life in patients with hip or knee

osteoarthritis: comparison of generic and disease-specific instruments. Clin Rheumatol

2005;24:29-37.

[109] Schumacker RE. A beginner's guide to structural equation modeling. Mahwah, N.J.:

Lawrence Erlbaum Associates, 2004.

[110] Schuster C, McCaskey M, Ettlin T. German translation, cross-cultural adaptation and

validation of the whiplash disability questionnaire. Health Qual Life Outcomes 2013;11:45.

[111] Shrout PE, Fleiss JL. Intraclass correlations: Uses in assessing rater reliability. Psychol

Bull 1979;86:420-428.

[112] Snaith RP, Baugh SJ, Clayden AD, Husain A, Sipple MA. The Clinical Anxiety Scale: an

instrument derived from the Hamilton Anxiety Scale. Br J Psychiatry 1982;141:518-523.

[113] Spitzer WO, Skovron ML, Salmi LR, Cassidy JD, Duranceau J, Suissa S, Zeiss E.

Scientific monograph of the Quebec Task Force on Whiplash-Associated Disorders: redefining

"whiplash" and its management. Spine 1995;20:1S-73S.

[114] Sterling M, Carroll LJ, Kasch H, Kamper SJ, Stemper B. Prognosis after whiplash injury:

where to from here? Discussion paper 4. Spine 2011;36:S330-4.

[115] Stewart M, Maher CG, Refshauge KM, Bogduk N, Nicholas M. Responsiveness of pain

and disability measures for chronic whiplash. Spine 2007;32:580-585.

[116] Stratford PW, Riddle DL, Binkley JM, Spadoni G, Westaway MD, Padfield B. Using the

Neck Disability Index to make decisions concerning individual patients. Physiotherapy Canada

1999;51:107-112.

[117] Streiner DL. Being inconsistent about consistency: when coefficient alpha does and

doesn't matter. J Pers Assess 2003;80:217-222.

[118] Streiner DL. Clinimetrics vs. psychometrics: an unnecessary distinction. J Clin Epidemiol

2003;56:1142-1145.

[119] Streiner DL. Health measurement scales: a practical guide to their development and use.

Toronto: Oxford University Press, 2003.

[120] Streiner DL. Starting at the beginning: an introduction to coefficient alpha and internal

consistency. J Pers Assess 2003;80:99-103.

[121] Stucki G. International Classification of Functioning, Disability, and Health (ICF): a

promising framework and classification for rehabilitation medicine. Am J Phys Med Rehabil

2005;84:733-740.

[122] Tavakol M, Dennick R. Making sense of Cronbach's alpha. International Journal of

Medical Education 2011;2:53-55.

[123] Terry R. Recent advances in measurement theory and the use of sociometric techniques.

New Dir Child Adolesc Dev 2000:27-53.

[124] Terwee CB, Jansma EP, Riphagen II, de Vet HCW. Development of a methodological

PubMed search filter for finding studies on measurement properties of measurement instruments.

Qual Life Res 2009;18:1115-1123.

[125] Terwee CB, Mokkink LB, Knol DL, Ostelo RW, Bouter LM, de Vet HC. Rating the

methodological quality in systematic reviews of studies on measurement properties: a scoring

system for the COSMIN checklist. Qual Life Res 2012;21:651-657.

[126] Terwee CB, Schellingerhout JM, Verhagen AP, Koes BW, de Vet HC. Methodological

quality of studies on the measurement properties of neck pain and disability questionnaires: a

systematic review. J Manipulative Physiol Ther 2011;34:261-272.

[127] Turner D, Griffiths AM, Steinhart AH, Otley AR, Beaton DE. Mathematical weighting of

a clinimetric index (Pediatric Ulcerative Colitis Activity Index) was superior to the judgmental

approach. J Clin Epidemiol 2009;62:738-744.

[128] U.S. Food and Drug Administration. Guidance for Industry and Food and Drug

Administration Staff - Factors to Consider When Making Benefit-Risk Determinations in

Medical Device Premarket Approvals and De Novo Classifications. 2011.

[129] van der Velde G, Beaton D, Hogg-Johnson S, Hurwitz E, Tennant A. Rasch analysis

provides new insights into the measurement properties of the neck disability index. Arthritis

Rheum 2009;61:544-551.

[130] Verhagen AP, Scholten-Peeters GG, van Wijngaarden S, de Bie RA, Bierma-Zeinstra SM.

Conservative treatments for whiplash. Cochrane Database Syst Rev 2007:003338.

[131] Vernon H, Mior S. The Neck Disability Index: a study of reliability and validity [erratum

appears in J Manipulative Physiol Ther 1992 Jan;15(1)]. J Manipulative Physiol Ther

1991;14:409-415.

[132] Vernon H, Mior S. The Northwick Park Neck Pain Questionnaire, devised to measure

neck pain and disability [comment]. Br J Rheumatol 1994;33:1203-1204.

[133] Walter SD, Eliasziw M, Donner A. Sample size and optimal designs for reliability studies.

Stat Med 1998;17:101-110.

[134] Walton D. A review of the definitions of 'recovery' used in prognostic studies on whiplash

using an ICF framework. Disabil Rehabil 2009;31:943-957.

[135] Walton DM, Pretty J, MacDermid JC, Teasell RW. Risk factors for persistent problems

following whiplash injury: results of a systematic review and meta-analysis. J Orthop Sports

Phys Ther 2009;39:334-350.

[136] Ware JE Jr. SF-36 health survey update. Spine 2000;25:3130-3139.

[137] Ware JE Jr, Sherbourne CD. The MOS 36-item short-form health survey (SF-36). I.

Conceptual framework and item selection. Med Care 1992;30:473-483.

[138] Weir JP. Quantifying test-retest reliability using the intraclass correlation coefficient and

the SEM. J Strength Cond Res 2005;19:231-240.

[139] Westaway MD, Stratford PW, Binkley JM. The patient-specific functional scale:

validation of its use in persons with neck dysfunction. Journal of Orthopaedic & Sports Physical

Therapy 1998;27:331-338.

[140] Willis C, Niere KR, Hoving JL, Green S, O'Leary EF, Buchbinder R. Reproducibility and

responsiveness of the Whiplash Disability Questionnaire. Pain 2004;110:681-688.

[141] Wolf SM. Quality assessment of ethics in health care: the accountability revolution. Am J

Law Med 1994;20:105-128.

[142] World Health Organization. International classification of functioning, disability and

health: ICF. Geneva, 2001.

[143] Wright JG, Feinstein AR. A comparative contrast of clinimetric and psychometric

methods for constructing indexes and rating scales. J Clin Epidemiol 1992;45:1201-1218.

[144] Wright JG, Young NL. A comparison of different indices of responsiveness. J Clin

Epidemiol 1997;50:239-246.

[145] Zigmond AS, Snaith RP. The hospital anxiety and depression scale. Acta Psychiatr Scand

1983;67:361-370.

[146] Zyzanski SJ, Perloff E. Clinimetrics and psychometrics work hand in hand. Arch Intern

Med 1999;159:1816-1817.

Appendices

Appendix 1: Questionnaires

A-1.1: Baseline Questionnaire

The University Health Network WDQ Validation Study

Baseline Questionnaire

STUDY NUMBER: _________________________ TODAY’S DATE: _______, _______, ________ (day) (month) (year)

Baseline v. Apr29.08 – Page 1

PLEASE PRINT ALL ANSWERS

SECTION A: In this section, we will be asking you questions about your traffic accident.

1. When did your traffic accident happen? Please provide the date of the accident:

Day___ Month___ Year 20____

SECTION B: In this section, we will be asking you a question about previous accidents and injuries that may have happened in the past 2 years. Please do not include your most recent accident and its related injuries when responding to this question.

1. Excluding any pain caused by the present accident, have you had any neck pain in the past two years?

No / Yes

SECTION C: In this section, we will be asking questions about your pain and its intensity as caused by your accident-related injuries.

1. Do you have neck pain caused by your car accident?

No: Skip to Question 2.
Yes: Please rate your average neck pain in the past 24 hours on a pain scale of 0 to 10, where 0 means no pain at all and 10 means pain as bad as it could be.

No pain 0 1 2 3 4 5 6 7 8 9 10 Pain as bad as could be

2. Do you have shoulder pain caused by your car accident?

No: Skip to Question 3.
Yes: Please rate your average shoulder pain in the past 24 hours on a pain scale of 0 to 10, where 0 means no pain at all and 10 means pain as bad as it could be.

No pain 0 1 2 3 4 5 6 7 8 9 10 Pain as bad as could be

3. Do you have low back pain caused by your car accident?

No: Skip to Question 4.
Yes: Please rate your average low back pain in the past 24 hours on a pain scale of 0 to 10, where 0 means no pain at all and 10 means pain as bad as it could be.

No pain 0 1 2 3 4 5 6 7 8 9 10 Pain as bad as could be

4. Do you have a headache caused by your car accident?

No: Skip to Question 5.
Yes: Please rate your average headache pain in the past 24 hours on a pain scale of 0 to 10, where 0 means no pain at all and 10 means pain as bad as it could be.

No pain 0 1 2 3 4 5 6 7 8 9 10 Pain as bad as could be

5. Other parts of your body?

a. Do you have pain in your arm(s) caused by your car accident?

No: Skip to Question 5b.
Yes: Please rate your average arm pain in the past 24 hours on a pain scale of 0 to 10, where 0 means no pain at all and 10 means pain as bad as it could be.

No pain 0 1 2 3 4 5 6 7 8 9 10 Pain as bad as could be

b. Do you have pain in your hand(s) caused by your car accident?

No: Skip to Question 5c.
Yes: Please rate your average hand pain in the past 24 hours on a pain scale of 0 to 10, where 0 means no pain at all and 10 means pain as bad as it could be.

No pain 0 1 2 3 4 5 6 7 8 9 10 Pain as bad as could be

c. Do you have pain in your face caused by your car accident?

No: Skip to Question 5d.
Yes: Please rate your average face pain in the past 24 hours on a pain scale of 0 to 10, where 0 means no pain at all and 10 means pain as bad as it could be.

No pain 0 1 2 3 4 5 6 7 8 9 10 Pain as bad as could be

d. Do you have pain in your leg(s) caused by your car accident?

No: Skip to Question 5e.
Yes: Please rate your average leg pain in the past 24 hours on a pain scale of 0 to 10, where 0 means no pain at all and 10 means pain as bad as it could be.

No pain 0 1 2 3 4 5 6 7 8 9 10 Pain as bad as could be

e. Do you have pain in your foot/feet caused by your car accident?

No: Skip to Question 5f.
Yes: Please rate your average foot pain in the past 24 hours on a pain scale of 0 to 10, where 0 means no pain at all and 10 means pain as bad as it could be.

No pain 0 1 2 3 4 5 6 7 8 9 10 Pain as bad as could be

f. Do you have pain in your mid back caused by your car accident?

No: Skip to Question 5g.
Yes: Please rate your average mid back pain in the past 24 hours on a pain scale of 0 to 10, where 0 means no pain at all and 10 means pain as bad as it could be.

No pain 0 1 2 3 4 5 6 7 8 9 10 Pain as bad as could be

g. Do you have pain in your abdomen, chest, or groin caused by your car accident?

No: Skip to Question 6.
Yes: Please rate your average abdomen/chest/groin pain in the past 24 hours on a pain scale of 0 to 10, where 0 means no pain at all and 10 means pain as bad as it could be.

No pain 0 1 2 3 4 5 6 7 8 9 10 Pain as bad as could be

6. Did the accident cause any of the following symptoms? (check any that apply)

Anxiety or worry
Anger
Concentration or attention problems
Difficulty moving your neck
Dizziness or unsteadiness
Feeling of numbness, tingling or pain in arms or hands
Feeling of numbness, tingling or pain in legs or feet
Hearing problems
Memory problems or forgetfulness
Pain when your neck is moved
Sleep problems
Sore jaw
Unusual fatigue or tiredness
Vision problems

SECTION D: In this section, we will be asking you questions about past and current health issues that are unrelated to your recent accident.

1. In general, would you say your health is:

Excellent / Very Good / Good / Fair / Poor

2. Compared to one week ago, how would you rate your health in general now?

Much better than one week ago
Somewhat better now than one week ago
About the same as one week ago
Somewhat worse now than one week ago
Much worse now than one week ago

3. The following questions are about activities you might do during a typical day. Does your health now limit you in these activities? If so, how much?

Response options for each activity: Yes, limited a lot / Yes, limited a little / No, not limited at all

a. Vigorous activities, such as running, lifting heavy objects, participating in strenuous sports
b. Moderate activities, such as moving a table, pushing a vacuum cleaner, bowling, or playing golf
c. Lifting or carrying groceries
d. Climbing several flights of stairs
e. Climbing one flight of stairs
f. Bending, kneeling, or stooping
g. Walking more than a mile
h. Walking several hundred yards
i. Walking one hundred yards
j. Bathing or dressing yourself

4. During the past week, how much of the time have you had any of the following problems with your work or other regular daily activities as a result of your physical health?

Response options for each item: All of the time / Most of the time / Some of the time / A little of the time / None of the time

a. Cut down on the amount of time you spent on work or other activities
b. Accomplished less than you would like
c. Were limited in the kind of work or other activities
d. Had difficulty performing the work or other activities (for example, it took extra effort)

5. During the past week, how much of the time have you had any of the following problems with your work or other regular daily activities as a result of any emotional problems (such as feeling depressed or anxious)?

Response options for each item: All of the time / Most of the time / Some of the time / A little of the time / None of the time

a. Cut down on the amount of time you spent on work or other activities
b. Accomplished less than you would like
c. Did work or activities less carefully than usual

6. During the past week, to what extent has your physical health or emotional problems interfered with your normal social activities with family, friends, neighbours or groups?

Not at all / Slightly / Moderately / Quite a bit / Extremely

7. How much bodily pain have you had during the past week?

None / Very mild / Mild / Moderate / Severe / Very severe

8. During the past week, how much did pain interfere with your normal work (including both work outside the home and housework)?

9. These questions are about how you have felt and how things have been with you during the past week. For each question, please give the one answer that comes closest to the way you have been feeling.

How much of the time during the past week…

Response options for each item: All of the time / Most of the time / Some of the time / A little of the time / None of the time

a. Did you feel full of life?
b. Have you been very nervous?
c. Have you felt so down in the dumps that nothing could cheer you up?
d. Have you felt calm and peaceful?
e. Did you have a lot of energy?
f. Have you felt downhearted and depressed?
g. Did you feel worn out?
h. Have you been happy?
i. Did you feel tired?

10. During the past week, how much of the time has your physical health or emotional problems interfered with your social activities (like visiting friends, relatives, etc.)?

All of the time / Most of the time / Some of the time / A little of the time / None of the time

11. How TRUE or FALSE is each of the following statements for you?

Response options for each statement: Definitely true / Mostly true / Don’t know / Mostly false / Definitely false

a. I seem to get sick a little easier than other people
b. I am as healthy as anybody I know
c. I expect my health to get worse
d. My health is excellent

SECTION E: In this section, we will be asking you a question about your expectations of recovery.

1. Do you think that your injury will…

get better soon / get better slowly / never get better / don’t know

SECTION F: In this section, we will be asking you a question regarding how well you believe your recovery is progressing.

7. How well do you feel you are recovering from your injuries?

Completely better / Much improved / Slightly improved / No change / Slightly worse / Much worse / Worse than ever

8. How do you feel your neck pain has changed since the injury?

Very much better / Better / Slightly better / No change / Slightly worse / Worse / Very much worse

SECTION G:

The questionnaire has been designed to give us information as to how your NECK PAIN has affected your ability to manage in everyday life. Please answer every question and mark in each section ONLY THE ONE BOX which applies to you. We realize you may consider that two of the statements in any one section relate to you, but PLEASE JUST MARK THE BOX WHICH MOST CLOSELY DESCRIBES YOUR PROBLEM.

SECTION 1: Pain Intensity

I have no pain at the moment
The pain is very mild at the moment
The pain is moderate at the moment
The pain is fairly severe at the moment
The pain is very severe at the moment
The pain is the worst imaginable at the moment

SECTION 2: Personal Care (Washing, Dressing etc.)

I can look after myself normally without causing extra pain
I can look after myself normally but it causes extra pain
It is painful to look after myself and I am slow and careful
I need some help but manage most of my personal care
I need help every day in most aspects of my personal care
I do not get dressed, I wash with difficulty, and stay in bed

SECTION 3: Lifting

I can lift heavy weights without extra pain I can lift heavy weights but it causes extra pain Pain prevents me from lifting heavy weights off the floor but I can manage if they are conveniently positioned (e.g. on a table) Pain prevents me from lifting heavy weights, but I can manage light to medium weights if they are conveniently positioned I can lift only very light weights I cannot lift or carry anything at all

SECTION 4: Reading

I can read as much as I want with no pain in my neck I can read as much as I want with slight pain in my neck I can read as much as I want with moderate pain in my neck I cannot read as much as I want because of moderate pain in my neck I can hardly read at all because of severe pain in my neck I cannot read at all

SECTION 5: Headache

I have no headaches at all
I have slight headaches which occur infrequently
I have moderate headaches which occur infrequently
I have moderate headaches which occur frequently
I have severe headaches which occur frequently
I have headaches almost all the time

SECTION 6: Concentration

I can concentrate fully when I want to with no difficulty
I can concentrate fully when I want to with slight difficulty
I have a fair degree of difficulty in concentrating when I want to
I have a lot of difficulty in concentrating when I want to
I have a great deal of difficulty in concentrating when I want to
I cannot concentrate at all

SECTION 7: Work

I can do as much work as I want to
I can do my usual work, but no more
I can do most of my usual work, but not all
I cannot do my usual work
I can hardly do any work at all
I cannot do any work at all

SECTION 8: Driving

I can drive my car without any neck pain
I can drive my car as long as I want with slight pain in my neck
I can drive my car as long as I want with moderate pain in my neck
I cannot drive my car as long as I want because of moderate pain in my neck
I can hardly drive at all because of severe pain in my neck
I cannot drive at all

SECTION 9: Sleeping

I have no trouble sleeping
My sleep is slightly disturbed (less than 1 hour sleepless)
My sleep is mildly disturbed (1-2 hours sleepless)
My sleep is moderately disturbed (2-3 hours sleepless)
My sleep is greatly disturbed (3-5 hours sleepless)
My sleep is completely disturbed (5-7 hours sleepless)

SECTION 10: Recreation

I am able to engage in all my recreation activities with no neck pain at all
I am able to engage in all my recreation activities with some neck pain
I am able to engage in most, but not all, my recreation activities because of pain in my neck
I am able to engage in few of my recreation activities because of pain in my neck
I can hardly do any recreation activities because of pain in my neck
I cannot do any recreation activities at all

The following scales have been designed to find out about your neck pain and how it’s affecting you. Please answer ALL the scales, and mark ONE number on EACH scale that best describes how you feel.

1. Over the past week, on average, how would you rate your neck pain?
No pain 0 1 2 3 4 5 6 7 8 9 10 Worst pain possible

2. Over the past week, how much has your neck pain interfered with your daily activities (housework, washing, dressing, lifting, reading, driving)?
No interference 0 1 2 3 4 5 6 7 8 9 10 Unable to carry out activity

3. Over the past week, how much has your neck pain interfered with your ability to take part in recreational, social, and family activities?
No interference 0 1 2 3 4 5 6 7 8 9 10 Unable to carry out activity

4. Over the past week, how anxious (tense, uptight, irritable, difficulty in concentrating/relaxing) have you been feeling?
Not at all anxious 0 1 2 3 4 5 6 7 8 9 10 Extremely anxious

5. Over the past week, how depressed (down-in-the-dumps, sad, in low spirits, pessimistic, unhappy) have you been feeling?
Not at all depressed 0 1 2 3 4 5 6 7 8 9 10 Extremely depressed

6. Over the past week, how have you felt your work (both inside and outside the home) has affected (or would affect) your neck pain?
Have made it no worse 0 1 2 3 4 5 6 7 8 9 10 Have made it much worse

7. Over the past week, how much have you been able to control (reduce/help) your neck pain on your own?
Completely control it 0 1 2 3 4 5 6 7 8 9 10 No control whatsoever


SECTION H: In this section, we will be asking you questions specifically about your daily activities and your feelings that may have been affected by your whiplash injuries.

Please circle a number in each section to indicate how you have been affected by the whiplash injury and symptoms. If one or more questions are not relevant to you, please leave that section blank.

1. How much pain do you have today?
No pain 0 1 2 3 4 5 6 7 8 9 10 Worst pain imaginable

2. How much do your whiplash symptoms interfere with your personal care (washing, dressing, etc)?
Not at all 0 1 2 3 4 5 6 7 8 9 10 Unable to perform

3. How much do your whiplash symptoms interfere with your work/home/study duties?
Not at all 0 1 2 3 4 5 6 7 8 9 10 Unable to perform

4. How much do your whiplash symptoms interfere with driving or using public transport?
Not at all 0 1 2 3 4 5 6 7 8 9 10 Unable to travel in car/use public transport

5. How much do your whiplash symptoms interfere with sleep?
Not at all 0 1 2 3 4 5 6 7 8 9 10 Cannot sleep

6. How tired/fatigued do you feel as a result of your whiplash injury/symptoms?
Not at all 0 1 2 3 4 5 6 7 8 9 10 Extreme tiredness/fatigue all the time

7. How much do your whiplash symptoms interfere with social activity?
Not at all 0 1 2 3 4 5 6 7 8 9 10 Unable to socialize

8. How much do your whiplash symptoms interfere with sporting activities?
Not at all 0 1 2 3 4 5 6 7 8 9 10 Unable to participate

9. How much do your whiplash symptoms interfere with non-sporting leisure activity?
Not at all 0 1 2 3 4 5 6 7 8 9 10 Unable to participate

10. How much sadness/depression do you experience as a result of your whiplash injury/symptoms?
None 0 1 2 3 4 5 6 7 8 9 10 Extreme sadness/depression

11. How much anger do you experience as a result of your whiplash injury/symptoms?
None 0 1 2 3 4 5 6 7 8 9 10 Extreme anger

12. How much anxiety do you experience as a result of your whiplash injury/symptoms?
None 0 1 2 3 4 5 6 7 8 9 10 Extreme anxiety

13. How much difficulty do you have concentrating as a result of your whiplash injury/symptoms?
No difficulty 0 1 2 3 4 5 6 7 8 9 10 Unable to concentrate


SECTION I: In this section, we will be asking you questions about your mood.

Below is a list of some of the ways you may have felt or behaved. Please indicate how often you have felt this way during the past week by checking the appropriate space.

Rarely or none of the time (less than 1 day)

Some or a little of the time (1-2 days)

Occasionally or a moderate amount of time (3-4 days)

Most or all of the time (5-7 days)

1. I was bothered by things that usually don't bother me. 0 1 2 3
2. I did not feel like eating; my appetite was poor. 0 1 2 3
3. I felt that I could not shake off the blues even with help from my family or friends. 0 1 2 3
4. I felt that I was just as good as other people. 0 1 2 3
5. I had trouble keeping my mind on what I was doing. 0 1 2 3
6. I felt depressed. 0 1 2 3
7. I felt that everything I did was an effort. 0 1 2 3
8. I felt hopeful about the future. 0 1 2 3
9. I thought my life had been a failure. 0 1 2 3
10. I felt fearful. 0 1 2 3
11. My sleep was restless. 0 1 2 3
12. I was happy. 0 1 2 3
13. I talked less than usual. 0 1 2 3
14. I felt lonely. 0 1 2 3
15. People were unfriendly. 0 1 2 3
16. I enjoyed life. 0 1 2 3
17. I had crying spells. 0 1 2 3
18. I felt sad. 0 1 2 3
19. I felt that people disliked me. 0 1 2 3
20. I could not get "going." 0 1 2 3


SECTION J: In this last section, we would like to know a little about you.

1. Age: ___________

2. Sex: Male Female

3. Marital Status:

Single, never married
Living common-law
Widowed
Married
Divorced
Separated

4. Please check your highest level of education:

Grade 8 or less
Higher than grade 8, but did not graduate from high school
High school graduate
Post secondary or some university
Technical school graduate
University graduate

5. What is your combined total family unit/household income per year?

$0 - $49,999
$50,000 - $59,999
$60,000 - $79,999
Above $80,000

6. Have you hired a lawyer or paralegal to help you with your claim?

No Yes


A-1.2: Three-to-Five Day Follow-up Questionnaire

Three-day Follow-up Study ID: _____________

SECTION J: In this section, we will be asking you questions specifically about your daily activities and your feelings that may have been affected by your whiplash injuries.

Please select a number to indicate how you have been affected by the whiplash injury and symptoms. If one or more questions are not relevant to you, please leave that section blank.

1. How much pain do you have today?
No pain 0 1 2 3 4 5 6 7 8 9 10 Worst pain imaginable

2. How much do your whiplash symptoms interfere with your personal care (washing, dressing, etc)?
Not at all 0 1 2 3 4 5 6 7 8 9 10 Unable to perform

3. How much do your whiplash symptoms interfere with your work/home/study duties?
Not at all 0 1 2 3 4 5 6 7 8 9 10 Unable to perform

4. How much do your whiplash symptoms interfere with driving or using public transport?
Not at all 0 1 2 3 4 5 6 7 8 9 10 Unable to travel in car/use public transport

5. How much do your whiplash symptoms interfere with sleep?
Not at all 0 1 2 3 4 5 6 7 8 9 10 Cannot sleep

6. How tired/fatigued do you feel as a result of your whiplash injury/symptoms?
Not at all 0 1 2 3 4 5 6 7 8 9 10 Extreme tiredness/fatigue all the time

7. How much do your whiplash symptoms interfere with social activity?
Not at all 0 1 2 3 4 5 6 7 8 9 10 Unable to socialize

8. How much do your whiplash symptoms interfere with sporting activities?
Not at all 0 1 2 3 4 5 6 7 8 9 10 Unable to participate

9. How much do your whiplash symptoms interfere with non-sporting leisure activity?
Not at all 0 1 2 3 4 5 6 7 8 9 10 Unable to participate

10. How much sadness/depression do you experience as a result of your whiplash injury/symptoms?
None 0 1 2 3 4 5 6 7 8 9 10 Extreme sadness/depression

11. How much anger do you experience as a result of your whiplash injury/symptoms?
None 0 1 2 3 4 5 6 7 8 9 10 Extreme anger

12. How much anxiety do you experience as a result of your whiplash injury/symptoms?
None 0 1 2 3 4 5 6 7 8 9 10 Extreme anxiety

13. How much difficulty do you have concentrating as a result of your whiplash injury/symptoms?
No difficulty 0 1 2 3 4 5 6 7 8 9 10 Unable to concentrate

14. How much do your whiplash symptoms interfere with your personal care (washing, dressing, etc)?
Not at all 0 1 2 3 4 5 6 7 8 9 10 Unable to perform

15. How much do your whiplash symptoms interfere with your work duties?
Not at all 0 1 2 3 4 5 6 7 8 9 10 Unable to perform

16. How much do your whiplash symptoms interfere with your home duties?
Not at all 0 1 2 3 4 5 6 7 8 9 10 Unable to perform

17. How much do your whiplash symptoms interfere with driving?
Not at all 0 1 2 3 4 5 6 7 8 9 10 Unable to travel in car

18. How much do your whiplash symptoms interfere with using public transport?
Not at all 0 1 2 3 4 5 6 7 8 9 10 Unable to use public transport

19. How tired do you feel as a result of your whiplash injury/symptoms?
Not at all 0 1 2 3 4 5 6 7 8 9 10 Extreme tiredness all the time

20. How fatigued do you feel as a result of your whiplash injury/symptoms?
Not at all 0 1 2 3 4 5 6 7 8 9 10 Extreme fatigue all the time

21. How much sadness do you experience as a result of your whiplash injury/symptoms?
None 0 1 2 3 4 5 6 7 8 9 10 Extreme sadness

22. How much depression do you experience as a result of your whiplash injury/symptoms?
None 0 1 2 3 4 5 6 7 8 9 10 Extreme depression




A-1.3: Six-week Follow-up Questionnaire

The University Health Network WDQ Validation Study

Follow-Up Questionnaire (Six weeks)

STUDY NUMBER: _________________________

TODAY’S DATE: _______, _______, ________ (day) (month) (year)

6 week Follow-up - v. Apr29.08 – Page 125


We would like to remind you that this information is confidential and will not be released to AVIVA or anyone else. If you do not wish to answer a question, please tell me and we will go on to the next question. If you do not understand a question and/or instruction, please ask me to explain it to you.


SECTION C: In this section, we will be asking questions about your pain and its intensity as caused by your accident-related injuries.

1. Which part(s) of your body was injured by the accident? (Check all that may apply)

Neck Face Abdomen/chest/groin Head Low back Leg(s) Shoulder(s) Hand(s) Arm(s) Foot/feet Mid back

2. Please rate your average neck pain in the past 24 hours on a pain scale of 0 to 10 where 0 means no pain at all and 10 means pain as bad as it could be.

No Pain 0 1 2 3 4 5 6 7 8 9 10 Pain as bad as could be

3. Please rate your average shoulder pain in the past 24 hours on a pain scale of 0 to 10 where 0 means no pain at all and 10 means pain as bad as it could be.


0 1 2 3 4 5 6 7 8 9 10

4. Please rate your average low back pain in the past 24 hours on a pain scale of 0 to 10 where 0 means no pain at all and 10 means pain as bad as it could be.


0 1 2 3 4 5 6 7 8 9 10

5. Please rate your average headache pain in the past 24 hours on a pain scale of 0 to 10 where 0 means no pain at all and 10 means pain as bad as it could be.


0 1 2 3 4 5 6 7 8 9 10

6. Please rate your average arm pain in the past 24 hours on a pain scale of 0 to 10 where 0 means no pain at all and 10 means pain as bad as it could be.


0 1 2 3 4 5 6 7 8 9 10

7. Please rate your average hand pain in the past 24 hours on a pain scale of 0 to 10 where 0 means no pain at all and 10 means pain as bad as it could be.

No Pain 0 1 2 3 4 5 6 7 8 9 10 Pain as bad as could be

1. Please rate your average face pain in the past 24 hours on a pain scale of 0 to 10 where 0 means no pain at all and 10 means pain as bad as it could be.

0 1 2 3 4 5 6 7 8 9 10

2. Please rate your average leg pain in the past 24 hours on a pain scale of 0 to 10 where 0 means no pain at all and 10 means pain as bad as it could be.

No Pain 0 1 2 3 4 5 6 7 8 9 10 Pain as bad as could be

3. Please rate your average foot pain in the past 24 hours on a pain scale of 0 to 10 where 0 means no pain at all and 10 means pain as bad as it could be.

No Pain 0 1 2 3 4 5 6 7 8 9 10 Pain as bad as could be

4. Please rate your average mid back pain in the past 24 hours on a pain scale of 0 to 10 where 0 means no pain at all and 10 means pain as bad as it could be.

No Pain 0 1 2 3 4 5 6 7 8 9 10 Pain as bad as could be

5. Please rate your average abdomen/chest/groin pain in the past 24 hours on a pain scale of 0 to 10 where 0 means no pain at all and 10 means pain as bad as it could be.

No Pain 0 1 2 3 4 5 6 7 8 9 10 Pain as bad as could be

6. Did the accident cause any of the following symptoms? (check any that apply)

Anxiety or worry
Anger
Concentration or attention problems
Difficulty moving your neck
Dizziness or unsteadiness
Feeling of numbness, tingling or pain in arms or hands
Feeling of numbness, tingling or pain in legs or feet
Hearing problems
Memory problems or forgetfulness
Pain when your neck is moved
Sleep problems
Sore jaw
Unusual fatigue or tiredness
Vision problems


SECTION D: In this section, we will be asking you questions about past and current health issues that are unrelated to your recent accident.

1. In general, would you say your health is:

Excellent Very Good Good Fair Poor

2. Compared to one week ago, how would you rate your health in general now?

Much better than one week ago

Somewhat better now than one week ago

About the same as one week ago

Somewhat worse now than one week ago

Much worse now than one week ago

3. The following questions are about activities you might do during a typical day. Does your health now limit you in these activities? If so, how much?

Yes, limited a lot

Yes, limited a little

No, not limited at all

a Vigorous activities, such as running, lifting heavy objects, participating in strenuous sports
b Moderate activities, such as moving a table, pushing a vacuum cleaner, bowling, or playing golf
c Lifting or carrying groceries
d Climbing several flights of stairs
e Climbing one flight of stairs
f Bending, kneeling, or stooping
g Walking more than a mile
h Walking several hundred yards
i Walking one hundred yards
j Bathing or dressing yourself

4. During the past week, how much of the time have you had any of the following problems with your work or other regular daily activities as a result of your physical health?

All of the time

Most of the time

Some of the time

None of the time

b Accomplished less than you would like
c Were limited in the kind of work or other activities
d Had difficulty performing the work or other activities (for example, it took extra effort)


5. During the past week, how much of the time have you had any of the following problems with your work or other regular daily activities as a result of any emotional problems (such as feeling depressed or anxious)?

All of the time

Most of the time

Some of the time

None of the time

b Accomplished less than you would like
c Did work or activities less carefully than usual

6. During the past week, to what extent has your physical health or emotional problems interfered with your normal social activities with family, friends, neighbours or groups?

7. How much bodily pain have you had during the past week?

None Very mild Mild Moderate Severe Very severe

8. During the past week, how much did pain interfere with your normal work (including both work outside the home and housework)?

9. These questions are about how you have felt and how things have been with you during the past week. For each question, please give the one answer that comes closest to the way you have been feeling.

How much of the time during the past week…

All of the time

Most of the time

Some of the time

None of the time

a Did you feel full of life?
b Have you been very nervous?
c Have you felt so down in the dumps that nothing could cheer you up?
d Have you felt calm and peaceful?
e Did you have a lot of energy?
f Have you felt downhearted and depressed?
g Did you feel worn out?
h Have you been happy?
i Did you feel tired?


10. During the past week, how much of the time has your physical health or emotional problems interfered with your social activities (like visiting friends, relatives, etc.)?

All of the time Most of the time Some of the time A little of the time None of the time

11. How TRUE or FALSE is each of the following statements for you?

Definitely true

Mostly true

Don’t know

Mostly false

Definitely false

a I seem to get sick a little easier than other people

b I am as healthy as anybody I know

c I expect my health to get worse

d My health is excellent


SECTION E: In this section, we will be asking you a question about your expectations of recovery.

12. Do you think that your injury will…

get better soon get better slowly never get better don’t know

SECTION F: In this section, we will be asking you a question regarding how well you believe your recovery is progressing.

1. How well do you feel you are recovering from your injuries?

Completely better Much improved Slightly improved No change Slightly worse Much worse Worse than ever


SECTION H: In this section, we will be asking you questions specifically about your daily activities and your feelings that may have been affected by your whiplash injuries.

Please circle a number in each section to indicate how you have been affected by the whiplash injury and symptoms. If one or more questions are not relevant to you, please leave that section blank.

2. How much pain do you have today?
No pain 0 1 2 3 4 5 6 7 8 9 10 Worst pain imaginable

3. How much do your whiplash symptoms interfere with your personal care (washing, dressing, etc)?
Not at all 0 1 2 3 4 5 6 7 8 9 10 Unable to perform

4. How much do your whiplash symptoms interfere with your work/home/study duties?
Not at all 0 1 2 3 4 5 6 7 8 9 10 Unable to perform

5. How much do your whiplash symptoms interfere with driving or using public transport?
Not at all 0 1 2 3 4 5 6 7 8 9 10 Unable to travel in car/use public transport

6. How much do your whiplash symptoms interfere with sleep?
Not at all 0 1 2 3 4 5 6 7 8 9 10 Cannot sleep

7. How tired/fatigued do you feel as a result of your whiplash injury/symptoms?
Not at all 0 1 2 3 4 5 6 7 8 9 10 Extreme tiredness/fatigue all the time

8. How much do your whiplash symptoms interfere with social activity?
Not at all 0 1 2 3 4 5 6 7 8 9 10 Unable to socialize

9. How much do your whiplash symptoms interfere with sporting activities?
Not at all 0 1 2 3 4 5 6 7 8 9 10 Unable to participate

10. How much do your whiplash symptoms interfere with non-sporting leisure activity?
Not at all 0 1 2 3 4 5 6 7 8 9 10 Unable to participate

11. How much sadness/depression do you experience as a result of your whiplash injury/symptoms?
None 0 1 2 3 4 5 6 7 8 9 10 Extreme sadness/depression

12. How much anger do you experience as a result of your whiplash injury/symptoms?
None 0 1 2 3 4 5 6 7 8 9 10 Extreme anger

13. How much anxiety do you experience as a result of your whiplash injury/symptoms?
None 0 1 2 3 4 5 6 7 8 9 10 Extreme anxiety

14. How much difficulty do you have concentrating as a result of your whiplash injury/symptoms?
No difficulty 0 1 2 3 4 5 6 7 8 9 10 Unable to concentrate


SECTION I: In this section, we will be asking you questions about your mood.

Below is a list of some of the ways you may have felt or behaved. Please indicate how often you have felt this way during the past week by checking the appropriate space.

Rarely or none of the time (less than 1 day)

Some or a little of the time (1-2 days)

Occasionally or a moderate amount of time (3-4 days)

Most or all of the time (5-7 days)

1. I was bothered by things that usually don't bother me. 0 1 2 3
2. I did not feel like eating; my appetite was poor. 0 1 2 3
3. I felt that I could not shake off the blues even with help from my family or friends. 0 1 2 3
4. I felt that I was just as good as other people. 0 1 2 3
5. I had trouble keeping my mind on what I was doing. 0 1 2 3
6. I felt depressed. 0 1 2 3
7. I felt that everything I did was an effort. 0 1 2 3
8. I felt hopeful about the future. 0 1 2 3
9. I thought my life had been a failure. 0 1 2 3
10. I felt fearful. 0 1 2 3
11. My sleep was restless. 0 1 2 3
12. I was happy. 0 1 2 3
13. I talked less than usual. 0 1 2 3
14. I felt lonely. 0 1 2 3
15. People were unfriendly. 0 1 2 3
16. I enjoyed life. 0 1 2 3
17. I had crying spells. 0 1 2 3
18. I felt sad. 0 1 2 3
19. I felt that people disliked me. 0 1 2 3
20. I could not get "going." 0 1 2 3


SECTION J: In this last section, we would like to know a little about you.

1. Have you hired a lawyer or paralegal to help you with your claim?

No Yes

2. Are you currently working? No Yes



Thank you for participating in this part of the study.


A-1.4: Addition to WIT Baseline Questionnaire

Baseline Study ID: _____________

SECTION H: In this section, we will be asking you questions regarding your neck pain and how it affects your everyday life.

THIS QUESTIONNAIRE IS DESIGNED TO HELP US BETTER UNDERSTAND HOW YOUR NECK PAIN AFFECTS YOUR ABILITY TO MANAGE EVERYDAY-LIFE ACTIVITIES. PLEASE MARK IN EACH SECTION THE ONE BOX THAT APPLIES TO YOU. ALTHOUGH YOU MAY CONSIDER THAT TWO OF THE STATEMENTS IN ANY ONE SECTION RELATE TO YOU, PLEASE MARK THE BOX THAT MOST CLOSELY DESCRIBES YOUR PRESENT-DAY SITUATION.

SECTION 1: Pain Intensity

I have no pain at the moment
The pain is very mild at the moment
The pain is moderate at the moment
The pain is fairly severe at the moment
The pain is very severe at the moment
The pain is the worst imaginable at the moment

SECTION 2: Personal Care (Washing, Dressing etc.)

I can look after myself normally without causing extra pain
I can look after myself normally but it causes extra pain
It is painful to look after myself and I am slow and careful
I need some help but manage most of my personal care
I need help every day in most aspects of my personal care
I do not get dressed, I wash with difficulty, and stay in bed

SECTION 3: Lifting

I can lift heavy weights without extra pain
I can lift heavy weights but it causes extra pain
Pain prevents me from lifting heavy weights off the floor but I can manage if they are conveniently positioned (e.g. on a table)
Pain prevents me from lifting heavy weights, but I can manage light to medium weights if they are conveniently positioned
I can lift only very light weights
I cannot lift or carry anything at all

SECTION 4: Reading

I can read as much as I want with no pain in my neck
I can read as much as I want with slight pain in my neck
I can read as much as I want with moderate pain in my neck
I cannot read as much as I want because of moderate pain in my neck
I can hardly read at all because of severe pain in my neck
I cannot read at all

Addition to WIT baseline - v. Oct06.08 – Page 137

SECTION 5: Headache

I have no headaches at all
I have slight headaches which occur infrequently
I have moderate headaches which occur infrequently
I have moderate headaches which occur frequently
I have severe headaches which occur frequently
I have headaches almost all the time

SECTION 6: Concentration

I can concentrate fully when I want to with no difficulty
I can concentrate fully when I want to with slight difficulty
I have a fair degree of difficulty in concentrating when I want to
I have a lot of difficulty in concentrating when I want to
I have a great deal of difficulty in concentrating when I want to
I cannot concentrate at all

SECTION 7: Work

I can do as much work as I want to
I can do my usual work, but no more
I can do most of my usual work, but not all
I cannot do my usual work
I can hardly do any work at all
I cannot do any work at all

SECTION 8: Driving

I can drive my car without any neck pain
I can drive my car as long as I want with slight pain in my neck
I can drive my car as long as I want with moderate pain in my neck
I cannot drive my car as long as I want because of moderate pain in my neck
I can hardly drive at all because of severe pain in my neck
I cannot drive at all

SECTION 9: Sleeping

I have no trouble sleeping
My sleep is slightly disturbed (less than 1 hour sleepless)
My sleep is mildly disturbed (1-2 hours sleepless)
My sleep is moderately disturbed (2-3 hours sleepless)
My sleep is greatly disturbed (3-5 hours sleepless)
My sleep is completely disturbed (5-7 hours sleepless)

SECTION 10: Recreation

I am able to engage in all my recreation activities with no neck pain at all
I am able to engage in all my recreation activities with some neck pain
I am able to engage in most, but not all, my recreation activities because of pain in my neck
I am able to engage in few of my recreation activities because of pain in my neck
I can hardly do any recreation activities because of pain in my neck
I cannot do any recreation activities at all


SECTION I:

The following scales have been designed to find out about your neck pain and how it’s affecting you. Please answer ALL the scales, and mark ONE number on EACH scale that best describes how you feel.

8. Over the past week, on average, how would you rate your neck pain?
No pain 0 1 2 3 4 5 6 7 8 9 10 Worst pain possible

9. Over the past week, how much has your neck pain interfered with your daily activities (housework, washing, dressing, lifting, reading, driving)?
No interference 0 1 2 3 4 5 6 7 8 9 10 Unable to carry out activity

10. Over the past week, how much has your neck pain interfered with your ability to take part in recreational, social, and family activities?
No interference 0 1 2 3 4 5 6 7 8 9 10 Unable to carry out activity

11. Over the past week, how anxious (tense, uptight, irritable, difficulty in concentrating/relaxing) have you been feeling?
Not at all anxious 0 1 2 3 4 5 6 7 8 9 10 Extremely anxious

12. Over the past week, how depressed (down-in-the-dumps, sad, in low spirits, pessimistic, unhappy) have you been feeling?
Not at all depressed 0 1 2 3 4 5 6 7 8 9 10 Extremely depressed

13. Over the past week, how have you felt your work (both inside and outside the home) has affected (or would affect) your neck pain?
Have made it no worse 0 1 2 3 4 5 6 7 8 9 10 Have made it much worse

14. Over the past week, how much have you been able to control (reduce/help) your neck pain on your own?
Completely control it 0 1 2 3 4 5 6 7 8 9 10 No control whatsoever


14. How much do your whiplash symptoms interfere with your personal care (washing, dressing, etc)?
0  1  2  3  4  5  6  7  8  9  10
(0 = Not at all, 10 = Unable to perform)

15. How much do your whiplash symptoms interfere with your work duties?
0  1  2  3  4  5  6  7  8  9  10
(0 = Not at all, 10 = Unable to perform)

16. How much do your whiplash symptoms interfere with your home duties?
0  1  2  3  4  5  6  7  8  9  10
(0 = Not at all, 10 = Unable to perform)

17. How much do your whiplash symptoms interfere with driving?
0  1  2  3  4  5  6  7  8  9  10
(0 = Not at all, 10 = Unable to travel in car)

18. How much do your whiplash symptoms interfere with using public transport?
0  1  2  3  4  5  6  7  8  9  10
(0 = Not at all, 10 = Unable to use public transport)

19. How tired do you feel as a result of your whiplash injury/symptoms?
0  1  2  3  4  5  6  7  8  9  10
(0 = Not at all, 10 = Extreme tiredness all the time)

20. How fatigued do you feel as a result of your whiplash injury/symptoms?
0  1  2  3  4  5  6  7  8  9  10
(0 = Not at all, 10 = Extreme fatigue all the time)

21. How much sadness do you experience as a result of your whiplash injury/symptoms?
0  1  2  3  4  5  6  7  8  9  10
(0 = None, 10 = Extreme sadness)

22. How much depression do you experience as a result of your whiplash injury/symptoms?
0  1  2  3  4  5  6  7  8  9  10
(0 = None, 10 = Extreme depression)

Addition to WIT baseline - v. Oct06.08 – Page 4

SECTION G: In this section, we will be asking you a question regarding how well






A-1.5: Addition to WIT Six-week Follow-up Questionnaire

WDQ Addition to the Whiplash Intervention Trial
Six-week Follow-up
Study ID: _____________

SECTION G: In this section, we will be asking you a question regarding how well


4. How do you feel your neck pain has changed since the injury?
Very much better
Better
Slightly better
No change
Slightly worse
Worse
Very much worse

5. How much do your whiplash symptoms interfere with your personal care (washing, dressing, etc)?
0  1  2  3  4  5  6  7  8  9  10
(0 = Not at all, 10 = Unable to perform)

6. How much do your whiplash symptoms interfere with your work duties?
0  1  2  3  4  5  6  7  8  9  10
(0 = Not at all, 10 = Unable to perform)

7. How much do your whiplash symptoms interfere with your home duties?
0  1  2  3  4  5  6  7  8  9  10
(0 = Not at all, 10 = Unable to perform)

8. How much do your whiplash symptoms interfere with driving?
0  1  2  3  4  5  6  7  8  9  10
(0 = Not at all, 10 = Unable to travel in car)

9. How much do your whiplash symptoms interfere with using public transport?
0  1  2  3  4  5  6  7  8  9  10
(0 = Not at all, 10 = Unable to use public transport)

10. How tired do you feel as a result of your whiplash injury/symptoms?
0  1  2  3  4  5  6  7  8  9  10
(0 = Not at all, 10 = Extreme tiredness all the time)

Addition to WIT 6wk follow-up - v. Oct06.08 – Page 143

11. How fatigued do you feel as a result of your whiplash injury/symptoms?
0  1  2  3  4  5  6  7  8  9  10
(0 = Not at all, 10 = Extreme fatigue all the time)

12. How much sadness do you experience as a result of your whiplash injury/symptoms?
0  1  2  3  4  5  6  7  8  9  10
(0 = None, 10 = Extreme sadness)

13. How much depression do you experience as a result of your whiplash injury/symptoms?
0  1  2  3  4  5  6  7  8  9  10
(0 = None, 10 = Extreme depression)

Addition to WIT 6wk follow-up - v. Oct06.08 – Page 2


Appendix 2: Ethics Certificates

A-2.1: University Health Network Ethics Approval


A-2.1: University Health Network Ethics Approval (continued)


A-2.2: University of Toronto Ethics Approval


Appendix 3: Baseline WDQ Distributions


Appendix 4: COSMIN Checklist completed with criteria relevant to this thesis

COSMIN checklist with 4-point scale

Contact
CB Terwee, PhD
VU University Medical Center
Department of Epidemiology and Biostatistics
EMGO Institute for Health and Care Research
1081 BT Amsterdam
The Netherlands
Website: www.cosmin.nl, www.emgo.nl
E-mail: cb.terwee@vumc.nl

Instructions

This version of the COSMIN checklist is recommended for use in systematic reviews of measurement properties. With this version it is possible to calculate overall methodological quality scores per study on a measurement property. A methodological quality score per box is obtained by taking the lowest rating of any item in a box (“worse score counts”). For example, if for a reliability study one item in the box “Reliability” is scored poor, the methodological quality of that reliability study is rated as poor. The Interpretability box and the Generalizability box are mainly used as data extraction forms. We recommend to use the Interpretability box to extract all information on the interpretability issues described in this box (e.g. norm scores, floor-ceiling effects, minimal important change) of the instruments under study from the included articles. Similarly, we recommend to use the Generalizability box to extract data on the characteristics of the study population and sampling procedure. Therefore no scoring system was developed for these boxes.
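The “worse score counts” rule in the instructions above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `box_quality`, the rating list, and the use of `None` for an item that was not rated are hypothetical choices and are not part of the COSMIN checklist itself.

```python
# Ratings ordered from worst to best, following the four-point scale.
RATINGS = ("poor", "fair", "good", "excellent")


def box_quality(item_ratings):
    """Return the lowest rating among the rated items of one box.

    Items given as None are treated as not rated and are skipped
    (an assumption made for this sketch).
    """
    rated = [r for r in item_ratings if r is not None]
    if not rated:
        raise ValueError("no rated items in this box")
    # The position in RATINGS defines "worse": the smallest index wins.
    return min(rated, key=RATINGS.index)


# Hypothetical reliability study: one item scored poor.
reliability_items = ["excellent", "good", "poor", "excellent"]
print(box_quality(reliability_items))  # poor
```

As in the example given in the instructions, a single item scored poor makes the whole box poor, regardless of how the other items were rated.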

This scoring system is described in this paper:

Terwee CB, Mokkink LB, Knol DL, Ostelo RWJG, Bouter LM, de Vet HCW. Rating the methodological quality in systematic reviews of studies on measurement properties: a scoring system for the COSMIN checklist. Quality of Life Research 2011, July 6 [epub ahead of print].

http://www.cosmin.nl/

http://www.emgo.nl/


Step 1. Evaluated measurement properties in the article

Measurement property        Box      Evaluated
Internal consistency        Box A    ✔
Reliability                 Box B    ✔
Measurement error           Box C    ✔
Content validity            Box D    N/A
Structural validity         Box E    ✔
Hypotheses testing          Box F    ✔
Cross-cultural validity     Box G    N/A
Criterion validity          Box H    N/A
Responsiveness              Box I    ✔

Step 2. Determining if the statistical methods used in the article are based on CTT or IRT

Box. General requirements for studies that applied Item Response Theory (IRT) models

N/A

1. Was the IRT model used adequately described? e.g. One Parameter Logistic Model (OPLM), Partial Credit Model (PCM), Graded Response Model (GRM)
   excellent: IRT model adequately described
   good: IRT model not adequately described

2. Was the computer software package used adequately described? e.g. RUMM2020, WINSTEPS, OPLM, MULTILOG, PARSCALE, BILOG, NLMIXED
   excellent: Software package adequately described
   good: Software package not adequately described

3. Was the method of estimation used adequately described? e.g. conditional maximum likelihood (CML), marginal maximum likelihood (MML)
   excellent: Method of estimation adequately described
   good: Method of estimation not adequately described

4. Were the assumptions for estimating parameters of the IRT model checked? e.g. unidimensionality, local independence, and item fit (e.g. differential item functioning (DIF))
   excellent: Assumptions of the IRT model checked
   good: Assumptions of the IRT model partly checked
   fair: Assumptions of the IRT model not checked or unknown

To obtain a total score for the methodological quality of studies that use IRT methods, the “worse score counts” algorithm should be applied to the IRT box in combination with the box of the measurement property that was evaluated in the IRT study. For example, if IRT methods are used to study internal consistency and item 4 in the IRT box is scored fair, while the items in the internal consistency box (box A) are all scored as good or excellent, the methodological quality score for internal consistency will be fair. However, if any of the items in box A is scored poor, the methodological quality score for internal consistency will be poor.
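The combination rule for IRT studies described in Step 2 can be illustrated with a short Python sketch. The function names and the example ratings are hypothetical; the sketch only reproduces the worked example from the text (item 4 of the IRT box scored fair, Box A items scored good or excellent).

```python
# Ratings ordered from worst to best, following the four-point scale.
RATINGS = ("poor", "fair", "good", "excellent")


def worst(ratings):
    """Return the lowest rating in a sequence of ratings."""
    return min(ratings, key=RATINGS.index)


def irt_study_quality(irt_box, property_box):
    """Apply 'worse score counts' to the IRT box and a property box together."""
    return worst(list(irt_box) + list(property_box))


# Hypothetical ratings: item 4 of the IRT box is scored fair.
irt_box = ["excellent", "excellent", "good", "fair"]
box_a = ["good", "excellent", "excellent", "good"]

print(irt_study_quality(irt_box, box_a))             # fair
print(irt_study_quality(irt_box, box_a + ["poor"]))  # poor
```

Pooling the two boxes before taking the minimum is one straightforward reading of "in combination with"; taking the worse of the two per-box scores gives the same result.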

Step 3. Determining if a study meets the standards for good methodological quality

Box A. Internal consistency

1. Does the scale consist of effect indicators, i.e. is it based on a reflective model?

Design requirements

2. Was the percentage of missing items given?
   excellent: Percentage of missing items described
   good: Percentage of missing items NOT described

3. Was there a description of how missing items were handled?
   excellent: Described how missing items were handled
   good: Not described but it can be deduced how missing items were handled
   fair: Not clear how missing items were handled

4. Was the sample size included in the internal consistency analysis adequate?
   excellent: Adequate sample size (≥100)
   good: Good sample size (50-99)
   fair: Moderate sample size (30-49)
   poor: Small sample size (<30)

5. Was the unidimensionality of the scale checked? i.e. was factor analysis or IRT model applied?
   excellent: Factor analysis performed in the study population
   good: Authors refer to another study in which factor analysis was performed in a similar study population
   fair: Authors refer to another study in which factor analysis was performed, but not in a similar study population
   poor: Factor analysis NOT performed and no reference to another study

6. Was the sample size included in the unidimensionality analysis adequate?
   excellent: 7* #items and ≥100
   good: 5* #items and ≥100 OR 6-7* #items but <100
   fair: 5* #items but <100
   poor: <5* #items

Note: effect and causal indicators

7. Was an internal consistency statistic calculated for each (unidimensional) (sub)scale separately?
   excellent: Internal consistency statistic calculated for each subscale separately
   poor: Internal consistency statistic NOT calculated for each subscale separately

8. Were there any important flaws in the design or methods of the study?
   excellent: No other important methodological flaws in the design or execution of the study
   fair: Other minor methodological flaws in the design or execution of the study
   poor: Other important methodological flaws in the design or execution of the study

Statistical methods

9. for Classical Test Theory (CTT), continuous scores: Was Cronbach’s alpha calculated?
   excellent: Cronbach’s alpha calculated
   fair: Only item-total correlations calculated
   poor: No Cronbach’s alpha and no item-total correlations calculated

10. for CTT, dichotomous scores: Was Cronbach’s alpha or KR-20 calculated?
   excellent: Cronbach’s alpha or KR-20 calculated
   fair: Only item-total correlations calculated
   poor: No Cronbach’s alpha or KR-20 and no item-total correlations calculated

11. for IRT: Was a goodness of fit statistic at a global level calculated? E.g. χ², reliability coefficient of estimated latent trait value (index of (subject or item) separation)
   excellent: Goodness of fit statistic at a global level calculated
   poor: Goodness of fit statistic at a global level NOT calculated

NB. Item 1 is used to determine whether internal consistency is relevant for the instrument under study. It is not used to rate the quality of the study.
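The sample-size thresholds in items 4 and 6 of Box A lend themselves to a small worked example. The Python sketch below is an interpretation rather than an official COSMIN implementation: the function names are hypothetical, and the reading of "6-7* #items but <100" as "at least six participants per item with fewer than 100 participants" is an assumption. The sample sizes and item counts in the example are illustrative only.

```python
def rate_sample_size(n):
    """Rate a sample size using the item 4 thresholds (>=100, 50-99, 30-49, <30)."""
    if n >= 100:
        return "excellent"
    if n >= 50:
        return "good"
    if n >= 30:
        return "fair"
    return "poor"


def rate_factor_analysis_sample(n, n_items):
    """Rate a sample size for a unidimensionality analysis (item 6 of Box A).

    Assumed reading of the criteria:
      excellent: n >= 7 * items and n >= 100
      good:      n >= 5 * items and n >= 100, or n >= 6 * items but n < 100
      fair:      n >= 5 * items but n < 100
      poor:      n < 5 * items
    """
    per_item = n / n_items
    if per_item >= 7 and n >= 100:
        return "excellent"
    if (per_item >= 5 and n >= 100) or (per_item >= 6 and n < 100):
        return "good"
    if per_item >= 5:
        return "fair"
    return "poor"


# Illustrative values only.
print(rate_sample_size(66))                  # good
print(rate_factor_analysis_sample(120, 13))  # excellent
print(rate_factor_analysis_sample(66, 13))   # fair
```

Under these thresholds a sample of 66 falls in the "good" band (50-99) for item 4.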

N/A

N/A

Box B. Reliability: relative measures (including test-retest reliability, inter-rater reliability and intra-rater reliability)

Design requirements

1. Was the percentage of missing items given?
   excellent: Percentage of missing items described
   good: Percentage of missing items NOT described

2. Was there a description of how missing items were handled?
   excellent: Described how missing items were handled
   good: Not described but it can be deduced how missing items were handled
   fair: Not clear how missing items were handled

3. Was the sample size included in the analysis adequate?
   excellent: Adequate sample size (≥100)
   good: Good sample size (50-99)
   fair: Moderate sample size (30-49)
   poor: Small sample size (<30)

4. Were at least two measurements available?
   excellent: At least two measurements
   poor: Only one measurement

5. Were the administrations independent?
   excellent: Independent measurements
   good: Assumable that the measurements were independent
   fair: Doubtful whether the measurements were independent
   poor: Measurements NOT independent

6. Was the time interval stated?
   excellent: Time interval stated
   fair: Time interval NOT stated

7. Were patients stable in the interim period on the construct to be measured?
   excellent: Patients were stable (evidence provided)
   good: Assumable that patients were stable
   fair: Unclear if patients were stable
   poor: Patients were NOT stable

8. Was the time interval appropriate?
   excellent: Time interval appropriate
   fair: Doubtful whether time interval was appropriate
   poor: Time interval NOT appropriate

Note: a sample size of 66 is adequate for a reliability study (≥100 should not be expected)

9. Were the test conditions similar for both measurements? e.g. type of administration, environment, instructions
   excellent: Test conditions were similar (evidence provided)
   good: Assumable that test conditions were similar
   fair: Unclear if test conditions were similar
   poor: Test conditions were NOT similar

10. Were there any important flaws in the design or methods of the study?
   excellent: No other important methodological flaws in the design or execution of the study
   fair: Other minor methodological flaws in the design or execution of the study
   poor: Other important methodological flaws in the design or execution of the study

Statistical methods

11. for continuous scores: Was an intraclass correlation coefficient (ICC) calculated?
   excellent: ICC calculated and model or formula of the ICC is described
   good: ICC calculated but model or formula of the ICC not described or not optimal. Pearson or Spearman correlation coefficient calculated with evidence provided that no systematic change has occurred
   fair: Pearson or Spearman correlation coefficient calculated WITHOUT evidence provided that no systematic change has occurred or WITH evidence that systematic change has occurred
   poor: No ICC or Pearson or Spearman correlations calculated

12. for dichotomous/nominal/ordinal scores: Was kappa calculated?
   excellent: Kappa calculated
   poor: Only percentage agreement calculated

13. for ordinal scores: Was a weighted kappa calculated?
   excellent: Weighted Kappa calculated
   fair: Unweighted Kappa calculated
   poor: Only percentage agreement calculated

14. for ordinal scores: Was the weighting scheme described? e.g. linear, quadratic
   excellent: Weighting scheme described
   good: Weighting scheme NOT described

N/A

N/A

N/A

Note: interviewer administered questionnaire in-person and by phone

Box C. Measurement error: absolute measures

1. Was the percentage of missing items given?
   excellent: Percentage of missing items described
   good: Percentage of missing items NOT described

2. Was there a description of how missing items were handled?
   excellent: Described how missing items were handled
   good: Not described but it can be deduced how missing items were handled
   fair: Not clear how missing items were handled

3. Was the sample size included in the analysis adequate?
   excellent: Adequate sample size (≥100)
   good: Good sample size (50-99)
   fair: Moderate sample size (30-49)
   poor: Small sample size (<30)

4. Were at least two measurements available?
   excellent: At least two measurements
   poor: Only one measurement

5. Were the administrations independent?
   excellent: Independent measurements
   good: Assumable that the measurements were independent
   fair: Doubtful whether the measurements were independent
   poor: Measurements NOT independent

6. Was the time interval stated?
   excellent: Time interval stated
   fair: Time interval NOT stated

7. Were patients stable in the interim period on the construct to be measured?
   excellent: Patients were stable (evidence provided)
   good: Assumable that patients were stable
   fair: Unclear if patients were stable
   poor: Patients were NOT stable

8. Was the time interval appropriate?
   excellent: Time interval appropriate
   fair: Doubtful whether time interval was appropriate
   poor: Time interval NOT appropriate

Note: same as for the reliability study

9. Were the test conditions similar for both measurements? e.g. type of administration, environment, instructions
   excellent: Test conditions were similar (evidence provided)
   good: Assumable that test conditions were similar
   fair: Unclear if test conditions were similar
   poor: Test conditions were NOT similar

10. Were there any important flaws in the design or methods of the study?
   excellent: No other important methodological flaws in the design or execution of the study
   fair: Other minor methodological flaws in the design or execution of the study
   poor: Other important methodological flaws in the design or execution of the study

Statistical methods

11. for CTT: Was the Standard Error of Measurement (SEM), Smallest Detectable Change (SDC) or Limits of Agreement (LoA) calculated?
   excellent: SEM, SDC, or LoA calculated
   good: Possible to calculate LoA from the data presented
   poor: SEM calculated based on Cronbach’s alpha, or on SD from another population

Box D. Content validity (including face validity)

General requirements

1. Was there an assessment of whether all items refer to relevant aspects of the construct to be measured?
   excellent: Assessed if all items refer to relevant aspects of the construct to be measured
   fair: Aspects of the construct to be measured poorly described AND this was not taken into consideration
   poor: NOT assessed if all items refer to relevant aspects of the construct to be measured

N/A

2. Was there an assessment of whether all items are relevant for the study population? (e.g. age, gender, disease characteristics, country, setting)
   excellent: Assessed if all items are relevant for the study population in adequate sample size (≥10)
   good: Assessed if all items are relevant for the study population in moderate sample size (5-9)
   fair: Assessed if all items are relevant for the study population in small sample size (<5)
   poor: NOT assessed if all items are relevant for the study population OR target population not involved

3. Was there an assessment of whether all items are relevant for the purpose of the measurement instrument? (discriminative, evaluative, and/or predictive)
   excellent: Assessed if all items are relevant for the purpose of the application
   good: Purpose of the instrument was not described but assumed
   fair: NOT assessed if all items are relevant for the purpose of the application

4. Was there an assessment of whether all items together comprehensively reflect the construct to be measured?
   excellent: Assessed if all items together comprehensively reflect the construct to be measured
   fair: No theoretical foundation of the construct and this was not taken into consideration
   poor: NOT assessed if all items together comprehensively reflect the construct to be measured

5. Were there any important flaws in the design or methods of the study?
   excellent: No other important methodological flaws in the design or execution of the study
   fair: Other minor methodological flaws in the design or execution of the study
   poor: Other important methodological flaws in the design or execution of the study


Box E. Structural validity

1. Does the scale consist of effect indicators, i.e. is it based on a reflective model?

Design requirements

2. Was the percentage of missing items given?
   excellent: Percentage of missing items described
   good: Percentage of missing items NOT described

3. Was there a description of how missing items were handled?
   excellent: Described how missing items were handled
   good: Not described but it can be deduced how missing items were handled
   fair: Not clear how missing items were handled

4. Was the sample size included in the analysis adequate?
   excellent: 7* #items and ≥100
   good: 5* #items and ≥100 OR 5-7* #items but <100
   fair: 5* #items but <100
   poor: <5* #items

5. Were there any important flaws in the design or methods of the study?
   excellent: No other important methodological flaws in the design or execution of the study
   fair: Other minor methodological flaws in the design or execution of the study (e.g. rotation method not described)
   poor: Other important methodological flaws in the design or execution of the study (e.g. inappropriate rotation method)

Note: effect and causal indicators

Statistical methods

6. for CTT: Was exploratory or confirmatory factor analysis performed?
   excellent: Exploratory or confirmatory factor analysis performed and type of factor analysis appropriate in view of existing information
   good: Exploratory factor analysis performed while confirmatory would have been more appropriate
   poor: No exploratory or confirmatory factor analysis performed

7. for IRT: Were IRT tests for determining the (uni-)dimensionality of the items performed?
   excellent: IRT test for determining (uni)dimensionality performed
   poor: IRT test for determining (uni)dimensionality NOT performed

Box F. Hypotheses testing

Design requirements

1. Was the percentage of missing items given?
   excellent: Percentage of missing items described
   good: Percentage of missing items NOT described

2. Was there a description of how missing items were handled?
   excellent: Described how missing items were handled
   good: Not described but it can be deduced how missing items were handled
   fair: Not clear how missing items were handled

3. Was the sample size included in the analysis adequate?
   excellent: Adequate sample size (≥100 per analysis)
   good: Good sample size (50-99 per analysis)
   fair: Moderate sample size (30-49 per analysis)
   poor: Small sample size (<30 per analysis)

N/A

4. Were hypotheses regarding correlations or mean differences formulated a priori (i.e. before data collection)?
   excellent: Multiple hypotheses formulated a priori
   good: Minimal number of hypotheses formulated a priori
   fair: Hypotheses vague or not formulated but possible to deduce what was expected
   poor: Unclear what was expected

5. Was the expected direction of correlations or mean differences included in the hypotheses?
   excellent: Expected direction of the correlations or differences stated
   good: Expected direction of the correlations or differences NOT stated

6. Was the expected absolute or relative magnitude of correlations or mean differences included in the hypotheses?
   excellent: Expected magnitude of the correlations or differences stated
   good: Expected magnitude of the correlations or differences NOT stated

7. for convergent validity: Was an adequate description provided of the comparator instrument(s)?
   excellent: Adequate description of the constructs measured by the comparator instrument(s)
   good: Adequate description of most of the constructs measured by the comparator instrument(s)
   fair: Poor description of the constructs measured by the comparator instrument(s)
   poor: NO description of the constructs measured by the comparator instrument(s)

8. for convergent validity: Were the measurement properties of the comparator instrument(s) adequately described?
   excellent: Adequate measurement properties of the comparator instrument(s) in a population similar to the study population
   good: Adequate measurement properties of the comparator instrument(s) but not sure if these apply to the study population
   fair: Some information on measurement properties (or a reference to a study on measurement properties) of the comparator instrument(s) in any study population
   poor: No information on the measurement properties of the comparator instrument(s)


9. Were there any important flaws in the design or methods of the study?
   excellent: No other important methodological flaws in the design or execution of the study
   fair: Other minor methodological flaws in the design or execution of the study (e.g. only data presented on a comparison with an instrument that measures another construct)
   poor: Other important methodological flaws in the design or execution of the study

Statistical methods

10. Were design and statistical methods adequate for the hypotheses to be tested?
   excellent: Statistical methods applied appropriate
   good: Assumable that statistical methods were appropriate, e.g. Pearson correlations applied, but distribution of scores or mean (SD) not presented
   fair: Statistical methods applied NOT optimal
   poor: Statistical methods applied NOT appropriate

Box G. Cross-cultural validity

1. Was the percentage of missing items given?
   excellent: Percentage of missing items described
   good: Percentage of missing items NOT described

2. Was there a description of how missing items were handled?
   excellent: Described how missing items were handled
   good: Not described but it can be deduced how missing items were handled
   fair: Not clear how missing items were handled

N/A

3. Was the sample size included in the analysis adequate?
   excellent: CTT: 7* #items and ≥100; IRT: ≥200 per group
   good: CTT: 5* #items and ≥100 OR 5-7* #items but <100; IRT: ≥200 in 1 group and 100-199 in 1 group
   fair: CTT: 5* #items but <100; IRT: 100-199 per group
   poor: CTT: <5* #items; IRT: <100 in 1 or both groups

4. Were both the original language in which the HR-PRO instrument was developed, and the language in which the HR-PRO instrument was translated described?
   excellent: Both source language and target language described
   poor: Source language NOT known

5. Was the expertise of the people involved in the translation process adequately described? e.g. expertise in the disease(s) involved, expertise in the construct to be measured, expertise in both languages
   excellent: Expertise of the translators described with respect to disease, construct, and language
   good: Expertise of the translators with respect to disease or construct poor or not described
   fair: Expertise of the translators with respect to language not described

6. Did the translators work independently from each other?
   excellent: Translators worked independent
   good: Assumable that the translators worked independent
   fair: Unclear whether translators worked independent
   poor: Translators worked NOT independent

7. Were items translated forward and backward?
   excellent: Multiple forward and multiple backward translations
   good: Multiple forward translations but one backward translation
   fair: One forward and one backward translation
   poor: Only a forward translation

8. Was there an adequate description of how differences between the original and translated versions were resolved?
   excellent: Adequate description of how differences between translators were resolved
   good: Poorly or NOT described how differences between translators were resolved


9. Was the translation reviewed by a committee (e.g. original developers)?
   excellent: Translation reviewed by a committee (involving other people than the translators, e.g. the original developers)
   good: Translation NOT reviewed by (such) a committee

10. Was the HR-PRO instrument pre-tested (e.g. cognitive interviews) to check interpretation, cultural relevance of the translation, and ease of comprehension?
   excellent: Translated instrument pre-tested in the target population
   good: Translated instrument pre-tested, but unclear if this was done in the target population
   fair: Translated instrument pre-tested, but NOT in the target population
   poor: Translated instrument NOT pre-tested

11. Was the sample used in the pre-test adequately described?
   excellent: Sample used in the pre-test adequately described
   fair: Sample used in the pre-test NOT (adequately) described

12. Were the samples similar for all characteristics except language and/or cultural background?
   excellent: Shown that samples were similar for all characteristics except language/culture
   good: Stated (but not shown) that samples were similar for all characteristics except language/culture
   fair: Unclear whether samples were similar for all characteristics except language/culture
   poor: Samples were NOT similar for all characteristics except language/culture

13. Were there any important flaws in the design or methods of the study?
   excellent: No other important methodological flaws in the design or execution of the study
   fair: Other minor methodological flaws in the design or execution of the study
   poor: Other important methodological flaws in the design or execution of the study


Statistical methods

14. for CTT: Was confirmatory factor analysis performed?
   excellent: Multiple-group confirmatory factor analysis performed
   poor: Multiple-group confirmatory factor analysis NOT performed

15. for IRT: Was differential item function (DIF) between language groups assessed?
   excellent: DIF between language groups assessed
   poor: DIF between language groups NOT assessed

Box H. Criterion validity

1. Was the percentage of missing items given?
   excellent: Percentage of missing items described
   good: Percentage of missing items NOT described

2. Was there a description of how missing items were handled?
   excellent: Described how missing items were handled
   good: Not described but it can be deduced how missing items were handled
   fair: Not clear how missing items were handled

3. Was the sample size included in the analysis adequate?
   excellent: Adequate sample size (≥100)
   good: Good sample size (50-99)
   fair: Moderate sample size (30-49)
   poor: Small sample size (<30)

4. Can the criterion used or employed be considered as a reasonable “gold standard”?
   excellent: Criterion used can be considered an adequate “gold standard” (evidence provided)
   good: No evidence provided, but assumable that the criterion used can be considered an adequate “gold standard”
   fair: Unclear whether the criterion used can be considered an adequate “gold standard”
   poor: Criterion used can NOT be considered an adequate “gold standard”

N/A

5 Were there any important flaws in the design or methods of the study?
Excellent: No other important methodological flaws in the design or execution of the study
Fair: Other minor methodological flaws in the design or execution of the study
Poor: Other important methodological flaws in the design or execution of the study

Statistical methods
6 for continuous scores: Were correlations, or the area under the receiver operating curve calculated?
Excellent: Correlations or AUC calculated
Poor: Correlations or AUC NOT calculated

7 for dichotomous scores: Were sensitivity and specificity determined?
Excellent: Sensitivity and specificity calculated
Poor: Sensitivity and specificity NOT calculated
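Item 6 names two statistics without showing how they are obtained. As a minimal sketch only, and not the analysis used in this thesis, the Python below computes a Pearson correlation between instrument and criterion scores, and an AUC defined as the probability that a randomly chosen criterion-positive respondent scores higher than a criterion-negative one. The function names (`pearson_r`, `auc`) and all data values are hypothetical and invented for illustration.

```python
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation between instrument scores x and criterion scores y."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def auc(scores, criterion_positive):
    """Area under the ROC curve by pairwise comparison (ties count as 0.5)."""
    pos = [s for s, c in zip(scores, criterion_positive) if c]
    neg = [s for s, c in zip(scores, criterion_positive) if not c]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data, invented for illustration only (not from the thesis)
instrument = [10, 35, 20, 60, 45, 80]
criterion = [12, 30, 25, 55, 50, 70]
is_case = [False, False, True, False, True, True]

print(round(pearson_r(instrument, criterion), 3))
print(round(auc(instrument, is_case), 3))
```

The pairwise definition of the AUC is quadratic in the sample size, which is acceptable for the sample sizes listed in this box; a rank-based computation would give the same value.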

Box I. Responsiveness
Rating scale: excellent, good, fair, poor

Design requirements
1 Was the percentage of missing items given?
Excellent: Percentage of missing items described
Good: Percentage of missing items NOT described

2 Was there a description of how missing items were handled?
Excellent: Described how missing items were handled
Good: Not described but it can be deduced how missing items were handled
Fair: Not clear how missing items were handled

Good: Good sample size (50-99)
Fair: Moderate sample size (30-49)
Poor: Small sample size (<30)

4 Was a longitudinal design with at least two measurements used?
Excellent: Longitudinal design used
Poor: No longitudinal design used

5 Was the time interval stated?
Excellent: Time interval adequately described
Poor: Time interval NOT described


6 If anything occurred in the interim period (e.g. intervention, other relevant events), was it adequately described?
Excellent: Anything that occurred during the interim period (e.g. treatment) adequately described
Good: Assumable what occurred during the interim period
Fair: Unclear or NOT described what occurred during the interim period

7 Was a proportion of the patients changed (i.e. improvement or deterioration)?
Excellent: Part of the patients were changed (evidence provided)
Good: NO evidence provided, but assumable that part of the patients were changed
Fair: Unclear if part of the patients were changed
Poor: Patients were NOT changed

Design requirements for hypotheses testing
For constructs for which a gold standard was not available:

8 Were hypotheses about changes in scores formulated a priori (i.e. before data collection)?
Excellent: Hypotheses formulated a priori
Fair: Hypotheses vague or not formulated but possible to deduce what was expected
Poor: Unclear what was expected

9 Was the expected direction of correlations or mean differences of the change scores of HR-PRO instruments included in these hypotheses?
Excellent: Expected direction of the correlations or differences stated
Good: Expected direction of the correlations or differences NOT stated

10 Was the expected absolute or relative magnitude of correlations or mean differences of the change scores of HR-PRO instruments included in these hypotheses?
Excellent: Expected magnitude of the correlations or differences stated
Good: Expected magnitude of the correlations or differences NOT stated


11 Was an adequate description provided of the comparator instrument(s)?
Excellent: Adequate description of the constructs measured by the comparator instrument(s)
Fair: Poor description of the constructs measured by the comparator instrument(s)
Poor: NO description of the constructs measured by the comparator instrument(s)

12 Were the measurement properties of the comparator instrument(s) adequately described?
Excellent: Adequate measurement properties of the comparator instrument(s) in a population similar to the study population
Good: Adequate measurement properties of the comparator instrument(s) but not sure if these apply to the study population
Fair: Some information on measurement properties (or a reference to a study on measurement properties) of the comparator instrument(s) in any study population
Poor: NO information on the measurement properties of the comparator instrument(s)

13 Were there any important flaws in the design or methods of the study?
Excellent: No other important methodological flaws in the design or execution of the study
Fair: Other minor methodological flaws in the design or execution of the study (e.g. only data presented on a comparison with an instrument that measures another construct)
Poor: Other important methodological flaws in the design or execution of the study

Statistical methods
14 Were design and statistical methods adequate for the hypotheses to be tested?
Excellent: Statistical methods applied appropriate
Fair: Statistical methods applied NOT optimal
Poor: Statistical methods applied NOT appropriate


Design requirement for comparison to a gold standard
For constructs for which a gold standard was available:

15 Can the criterion for change be considered as a reasonable gold standard?
Excellent: Criterion used can be considered an adequate 'gold standard' (evidence provided)
Good: No evidence provided, but assumable that the criterion used can be considered an adequate 'gold standard'
Fair: Unclear whether the criterion used can be considered an adequate 'gold standard'
Poor: Criterion used can NOT be considered an adequate 'gold standard'

16 Were there any important flaws in the design or methods of the study?
Excellent: No other important methodological flaws in the design or execution of the study
Fair: Other minor methodological flaws in the design or execution of the study
Poor: Other important methodological flaws in the design or execution of the study

Statistical methods
17 for continuous scores: Were correlations between change scores, or the area under the receiver operating characteristic (ROC) curve calculated?
Excellent: Correlations or area under the ROC curve (AUC) calculated
Poor: Correlations or AUC NOT calculated

18 for dichotomous scales: Were sensitivity and specificity (changed versus not changed) determined?
Excellent: Sensitivity and specificity calculated
Poor: Sensitivity and specificity NOT calculated
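Item 18 asks whether sensitivity and specificity were determined for a changed versus not-changed classification. The following is a minimal sketch, assuming a hypothetical cut-off applied to change scores and an external criterion that labels each patient as changed or not; the function name, the cut-off of 20 points, and the data are all invented for illustration and are not taken from the thesis.

```python
def sensitivity_specificity(flagged, criterion_changed):
    """Return (sensitivity, specificity) of a changed/not-changed classification."""
    pairs = list(zip(flagged, criterion_changed))
    tp = sum(1 for f, c in pairs if f and c)          # flagged and truly changed
    fn = sum(1 for f, c in pairs if not f and c)      # missed true change
    tn = sum(1 for f, c in pairs if not f and not c)  # correctly not flagged
    fp = sum(1 for f, c in pairs if f and not c)      # flagged but not changed
    return tp / (tp + fn), tn / (tn + fp)


# Toy data, invented for illustration only
change_scores = [30, 25, 5, 10, 28, 3]
criterion_changed = [True, True, True, False, False, False]
cutoff = 20  # hypothetical cut-off on the change score

flagged = [score >= cutoff for score in change_scores]
sens, spec = sensitivity_specificity(flagged, criterion_changed)
print(round(sens, 3), round(spec, 3))
```

Sensitivity is the proportion of criterion-changed patients whom the instrument flags as changed; specificity is the proportion of criterion-unchanged patients whom it does not flag.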


N/A


N/A


Interpretability
We recommend using the Interpretability box to extract all information on the interpretability issues described in this box for the instruments under study from the included articles.

Box Interpretability
Percentage of missing items: few; 18.5-24% missed 1 of 13 items
Description of how missing items were handled: sensitivity analyses (min/max values)
Distribution of the (total) scores: presented as graphs, mean (SD), skewness
Percentage of the respondents who had the lowest possible (total) score: none; lowest score = 2
Percentage of the respondents who had the highest possible (total) score: none; highest total WDQ = 119
Scores and change scores (i.e. means and SD) for relevant (sub)groups, e.g. for normative groups, subgroups of patients, or the general population: presented in responsiveness section for WDQ and subscales
Minimal Important Change (MIC) or Minimal Important Difference (MID): not estimated; MDC = 22 points
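The two "lowest possible" and "highest possible" score items are the usual floor and ceiling checks. A minimal sketch of that calculation follows, assuming a total score range of 0 to 130 (13 items scored 0 to 10 each, which is an assumption not stated in this box). Only the observed extremes 2 and 119 come from the annotations above; the remaining scores and the function name are hypothetical.

```python
def floor_ceiling(total_scores, min_possible, max_possible):
    """Percentage of respondents at the lowest and highest possible total score."""
    n = len(total_scores)
    floor_pct = 100.0 * sum(1 for s in total_scores if s == min_possible) / n
    ceiling_pct = 100.0 * sum(1 for s in total_scores if s == max_possible) / n
    return floor_pct, ceiling_pct


# Illustrative scores: 2 and 119 are the reported extremes, the rest are invented
scores = [2, 40, 57, 88, 119]

# Assumed scoring range of 0-130 for the total score
print(floor_ceiling(scores, min_possible=0, max_possible=130))
```

With no respondent at either end of the assumed range, both percentages are zero, which is consistent with the "none" entries recorded for these two items.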

Generalizability
We recommend using the Generalizability box to extract data on the characteristics of the study populations and sampling procedures of the included studies.

Box Generalizability
Median or mean age (with standard deviation or range): mean 42.1 (SD=13.2); 19.6-81.6
Distribution of sex: 70% female (91/130)
Important disease characteristics (e.g. severity, status, duration) and description of treatment: presented in Table 1
Setting(s) in which the study was conducted (e.g. general population, primary care or hospital/rehabilitation care): hospital/rehab centres at University Health Network (1 urban; 2 suburban)
Countries in which the study was conducted: Canada (Ontario)
Language in which the HR-PRO instrument was evaluated: English
Method used to select patients (e.g. convenience, consecutive, or random): convenience cohort if assessed for RCT regardless of RCT enrolment
Percentage of missing responses (response rate): 1 WDQ was not completed at baseline
