1 9/23/2015 Patient Reported Outcomes Measurement Information System (PROMIS) Patient Reported...

1 04/19/23

Patient Reported Outcomes Patient Reported Outcomes Measurement Measurement

Information System (PROMIS)Information System (PROMIS)National Institute of Arthritis and Musculoskeletal and Skin Diseases (NIAMS) U01 grant

Ron D. Hays, Ph.D., UCLA Ron D. Hays, Ph.D., UCLA

September 29, 2006 September 29, 2006

GIM/HSR Research Seminar SeriesGIM/HSR Research Seminar Series

12:02pm-12:59 pm12:02pm-12:59 pm

2 04/19/23

NIH Director Elias A. Zerhouni, MD

“There is a pressing need to better quantify clinically important symptoms and outcomes that are now difficult to measure. Clinical outcome measures, such as x-rays and lab tests, have minimally immediate relevance to the day-to-day functioning of patients with chronic diseases such as arthritis, multiple sclerosis, and asthma, as well as chronic pain conditions.

Often, the best way patients can judge the effectiveness of treatments is by perceived changes in symptoms. One main goal of the PROMIS initiative is to develop a set of publicly available computerized adaptive tests for the clinical research community.”

3 04/19/23

NIH RM04-011: Dynamic Assessment of Patient-Reported Chronic Disease Outcomes

Translation arm of re-engineering clinical research enterprise

Chronic diseases and their treatment affect “quality of life”

Improved assessment of “subjective” clinical outcomes

- Self-reported symptoms and other health-related quality of life domains

4 04/19/23

ADL – Activities of Daily LivingIADL – Instrumental Activities of Daily LivingG – Global Item

Satisfaction

Satisfaction

Satisfaction

Satisfaction

Performance

G Symptoms

G Activities: Instrumental Activities of Daily Living

[IADL] (e.g. errands)

Other

Social Support

G Anxiety

G Anger/Aggression

G Depression

G Fatigue

G Substance Abuse

Positive Psychological Functioning

Negative Impacts of Illness

Subjective Well-Being (positive affect)

Positive Impacts of Illness

Meaning and Coherence (spirituality)

G Emotional Distress

Mastery and Control (self-efficacy)

Cognitive Function

G Central: neck and back (twisting, bending, etc)

G Lower Extremities: walking, arising, etc [mobility]

G Upper Extremities: grip, buttons, etc [dexterity]

G Function/Disability

G Physical Health

G Mental Health

G Social Health

Satisfaction

G Health

G Pain

Patient-Reported Outcomes (PROs)Preliminary PROMIS Domains (light shade)

G Role Participation

1108054

5 04/19/23

Initial PROMIS Domains

Physical functioning (4)

Pain (3)

Fatigue (2)

Social/role participation (2)

Emotional distress

Anxiety

Depression

Anger

Alcohol abuse

6 04/19/23

Objectives

Develop and test a large bank of items measuring health-related quality of life

Create a publicly available, adaptable and sustainable system allowing clinical researchers access to a

* common item repository

* computer adaptive testing (CAT) platform

for efficient assessment across a range of chronic diseases

7 04/19/23

What would youlike to measure in your study?

Domain definitions

item content review

item characteristics

Click below for:

or select a box on the right and proceed

http://www.nihpromis.org

pain

emotional distress

physicalfunction

fatigue

social roleparticipation

8 04/19/23

Collaborative Agreement

Steering Committee

6 Primary Research Sites

Statistical Coordinating Center

Scientific Advisory Board

9 04/19/23

PROMIS Network

UNC –Chapel Hill ● UNC –Chapel Hill ●

● Duke University● Duke University

● Stanford University● Stanford University

Stony Brook University

●

Stony Brook University

●● University of Pittsburgh● University of Pittsburgh

● University of Washington● University of Washington

CORE ♥CORE ♥ ● NIHNIH● NIHNIH

10 04/19/23

Primary Research Sites

Duke (Kevin Weinfurt, evaluation committee, participation and data quality committee, use in clinical trials, cancer supplement)

Stanford (Jim Fries, physical function, domain hierarchy)

Stony Brook (Arthur Stone, fatigue, ecological momentary assessment)

UNC (Darren DeWalt, social/role participation, pediatrics, literacy)

University of Pittsburgh (Paul Pilkonis, emotional distress, sleep)

University of Washington (Dagmar Amtmann, pain, universal access)

11 04/19/23

Statistical Coordinating Center

Northwestern (David Cella and cast)

UCLA (Hays, Liu, Reise, Spritzer, Morales)

Other consultants (e.g., Dennis Revicki)

Westat

12 04/19/23

What PROMIS Promises

P recisionP recisionR epositoryR epository

Outcome toolsOutcome tools

M ethodologiesM ethodologies

I nterpretabilityI nterpretability

S oftwareS oftware

13 04/19/23

Precision

Fewer items needed for equal precision

Making assessment briefer

Error is understood at the individual level

Enabling individual assessment

14 04/19/23

Item Bank(IRT-calibrated items)

ItemResponseTheory(IRT) 0.0

0.5

1.0

1.5

2.0

2.5

-3 -2 -1 0 1 2 3

Theta

Info

rma

tio

n

0.0

0.2

0.4

0.6

0.8

1.0

-3 -2 -1 0 1 2 3

Theta

Pro

ba

bil

ity o

f R

esp

on

se

Short FormInstruments

CAT

Items fromInstrument

A

Item Pool


B


C

NewItems

SecondaryData Analysis

CognitiveTesting

FocusGroups

Content ExpertReview

Questionnaireadministered to largerepresentative sample

15 04/19/23

0%10%20%30%40%50%60%70%80%90%

100%

-4

-3.5 -3

-2.5 -2

-1.5 -1

-0.5 0

0.5 1

1.5 2

2.5 3

3.5 4

Level of Physical Functioning

Pro

ba

bil

ity

of

"Ye

s"

Re

sp

on

se

Item 1 (Difficulty = -1) Item 3 (Difficulty = 1)

Item Characteristic Curves (able to climb flight of stairs versus run a mile)

16 04/19/23

Item Characteristic Curves(2-Parameter Model)

17 04/19/23

Item Banks

no pain

mild pain

moderate pain

severe pain extreme pain

Pain Item Bank

Item1

Item2

Item3

Item4

Item5

Item6

Item7

Item8

Item9

Itemn

An item bank is comprised of a large collection of items measuring a single domain (e.g., pain).

18 04/19/23

Fatigue Measure and Standard Error Comparision by Test Length

0.0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1.0

-4 -3 -2 -1 0 1 2 3 4

Fatigue Measure

Sta

nd

ard

Err

or

5 Item CAT 10 Item CAT 72 Item Bank 6 Item SF 13 Item Scale

Fatigue Measure and Standard Fatigue Measure and Standard Error Comparison by Test Error Comparison by Test

LengthLength

19 04/19/23

Item Bank(IRT-calibrated emotional distress items)

-3 -2 -1 0 1 2 3

Severe high moderate low very low

Emotional Distress

0.0

0.2

0.4

0.6

0.8

1.0

-3.00 -2.00 -1.00 0.00 1.00 2.00 3.00

How often did you feel nervous?

All of the time

Most of the time

Little of the time

Some of the time

None of the time

20 04/19/23


-3 -2 -1 0 1 2 3


Emotional Distress

0.0

0.2

0.4

0.6

0.8

1.0

-3.00 -2.00 -1.00 0.00 1.00 2.00 3.00


Some of the time

21 04/19/23


-3 -2 -1 0 1 2 3


Emotional Distress

0.0

0.2

0.4

0.6

0.8

1.0

-3.00 -2.00 -1.00 0.00 1.00 2.00 3.00


Some of the time

22 04/19/23


-3 -2 -1 0 1 2 3


Emotional Distress

0.0

0.2

0.4

0.6

0.8

1.0

-3.00 -2.00 -1.00 0.00 1.00 2.00 3.00

How often did you feel hopeless?

All of the time

Most of the time

Little of the time

Some of the time

None of the time

23 04/19/23


-3 -2 -1 0 1 2 3


Emotional Distress

0.0

0.2

0.4

0.6

0.8

1.0

-3.00 -2.00 -1.00 0.00 1.00 2.00 3.00

How often did you feel hopeless?

Some of the time

24 04/19/23


-3 -2 -1 0 1 2 3


Emotional Distress

0.0

0.2

0.4

0.6

0.8

1.0

-3.00 -2.00 -1.00 0.00 1.00 2.00 3.00

How often did you feel worthless?

All of the time

Most of the time

Little of the time

Some of the time

None of the time

25 04/19/23


-3 -2 -1 0 1 2 3


Emotional Distress

0.0

0.2

0.4

0.6

0.8

1.0

-3.00 -2.00 -1.00 0.00 1.00 2.00 3.00


Little of the time

26 04/19/23


-3 -2 -1 0 1 2 3


0.0

0.2

0.4

0.6

0.8

1.0

-3.00 -2.00 -1.00 0.00 1.00 2.00 3.00


Little of the time

Target in onemotional

distress score

27 04/19/23

Repository

28 04/19/23

Repository: PROMIS Item Library

Literature searches and investigator contributions to the PROMIS domains

Relational database of more than 7,000 items

Catalog characteristics of items including

Context

Stem

Response options

Time frame

Instrument of origin (if applicable) Intellectual property status Track modifications to items

29 04/19/23

Outcome Tools

30 04/19/23

Item LibraryItem LibraryItem LibraryItem Library

CATCATCATCAT Build-a-PROBuild-a-PROBuild-a-PROBuild-a-PROPick-a-PROPick-a-PROPick-a-PROPick-a-PRO

31 04/19/23

Methodologies

32 04/19/23

Methodologies

Qualitative Item Review

Expert item review of 6,871 items

26 focus groups; over 120 patients interviewed; over 700 surveys

Analysis Plan

Classical test theory and IRT analyses

Sampling Plan

11,500 people; 951 items; minimum n = 500/item

33 04/19/23

Interpretation

34 04/19/23

LowLowLowLow HighHighHighHigh

PersonPerson Fatigue Score Fatigue ScorePersonPerson Fatigue Score Fatigue Score

Interpretation

Q Q QQ Q Q Q Q QQ Q Q Q Q QQ Q Q Q Q QQ Q Q Q Q Q Q Q Q LikelyLikely““I get tiredI get tiredwhen I runwhen I runa marathon”a marathon”

LikelyLikely““I get tiredI get tiredwhen I runwhen I runa marathon”a marathon”

UnlikelyUnlikely““I get tiredI get tiredwhen I getwhen I getout of a out of a chair” chair”

UnlikelyUnlikely““I get tiredI get tiredwhen I getwhen I getout of a out of a chair” chair” ItemItem Location LocationItemItem Location Location

Q Q QQ Q QQ Q QQ Q Q Q Q QQ Q Q

Q Q QQ Q Q Q Q QQ Q Q Q Q QQ Q QQ Q QQ Q Q

Q Q QQ Q QQ Q QQ Q QQ Q QQ Q Q

Q Q Q Q

Q Q Q Q

Q Q Q Q

35 04/19/23


PRO Bank PRO Bank PersonPerson Score ScorePRO Bank PRO Bank PersonPerson Score Score

Interpretation Aids

30 40 50 60 70

M = 50, SD = 10

T = (z * 10) + 50

36 04/19/23


Example of high fatigue

Fatigue Score=60Fatigue Score=60Fatigue Score=60Fatigue Score=60

30 40 50 60 70

This patient’s fatigue score is 60, significantly worse than average (50). People who score 60 on fatigue tend to answer questions as follows:

…”I have been too tired to climb one flight of stairs: VERY MUCH

…”I have had enough energy to go out with my family: A LITTLE BIT

Click here if you would like to see this patient’s individual answers

37 04/19/23


Example of low fatigue

Fatigue Score=40Fatigue Score=40Fatigue Score=40Fatigue Score=40

30 40 50 60 70

This patient’s fatigue score is 40, significantly better than average (50). People who

score 40 on fatigue tend to answer questions as follows:

…”I have been too tired to climb one flight of stairs: SOMEWHAT…”I have had enough energy to go out with my family: VERY MUCH

Click here if you would like to see this patient’s individual answers

38 04/19/23

Software

39 04/19/23

Software

Survey Construction Report Module

Clinical/PRO Data

Study Protocols

Coming 2007-2008 to www.nihpromis.org

Item Banks

Patients Clinicians

Item Library

Laptop PhonePDAWeb

Survey Module

Web Laptop PDA IVR

CAT CAT Pick-a-PROPick-a-PROBuild-a-PROBuild-a-PRO

Domains and itemDomains and item banks banks

40 04/19/23

Web-based administration: Emotional distress (Anger) item

41 04/19/23

Four possible administration formats

Automatic advance, not allowed to go back

Automatic advance, allowed to go back

Click to continue after response, not allowed to go back

Click to continue, allowed to go back

42 04/19/23

Evaluated administration format

806 participants in Polimetrix PollingPlace registry

56 items measuring the performance of social/role activities

Items rated on a 5-point frequency scale ranging from ”never” to “always”

56 items measuring satisfaction with social/role activities

Items rated on a 5-point extent scale ranging from “not at all” to “very much”

43 04/19/23

Analysis Plan

Examined differences in:

Time spent

Mean domain scores

Variance in scores

Reliability

44 04/19/23

Administration format conclusions

Use automatic advance rather than continue button

Use back button to guard against accidental key entry

Response time cost was minimal

No effect on missing data or scores

45 04/19/23

PROMIS Sampling

Administer a large number of items to a range of population subgroups (general population and clinical) to permit the estimation of item parameters for item banks in five health-related quality of life domains.

46 04/19/23

Number of items

1013 items

784 items: 14 item banks (56 items per bank) in 5 domains

167 legacy items

62 demographic Items

146-202 items administered to any one respondent

n = 11,500 overall (500 observations per item minimum)

47 04/19/23

Target Polimetrix General Population Demographics

GenderMale 50%

Female 50%

Age

18-29 = 20%

30-44 = 20%

45-59 = 20%

60-74 = 20%

75+ = 20%

Ethnicity (match the general population)

Black = 12.3%

Hispanic or Latino = 12.5%

Education Min 25% High school graduate or less

48 04/19/23

SamplesN

General population (Polimetrix) 7,500

Cancer (Duke, Polimetrix) 1000

Heart Disease (Duke, Polimetrix) 500

Rheumatoid arthritis (Stanford) 500

Osteoarthritis (Stanford, Polimetrix) 500

Psychiatric (Pittsburgh, Polimetrix) 500

Spinal Cord Injury (Polimetrix) 500

Chronic Obstructive Pulmonary Disease (Polimetrix)

500

TOTAL 11,500

49 04/19/23

Domain Sub-Domain Items Form Sample N

Emotional Distress

Anxiety 56A Gen. Pop. 1 500

Depression 56

Anger/Aggression 56B Gen. Pop. 2 500

Alcohol Abuse 56

Physical Function

Part I 56C Gen. Pop. 3 500

Part II 56

Part III 56 G Gen. Pop. 7 500

Fatigue Impact 56

D Gen. Pop. 4 500Experience 56

Social Role Impact 56

E Gen. Pop. 5 500Experience 56

Pain

Interference 56F Gen. Pop. 6 500

Quality 56

Behavior 56 G Gen. Pop. 7 (above)

Full Bank Administration

50 04/19/23

Item Number → 1 1 1 1 1 1 5 5 5

Sample Size ↓ 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 4 5 6

General Pop. 8 250 H H H H H H H

General Pop. 9 250 I I I I I I I

General Pop. 10 250 J J J J J J J

Heart Disease250 H H H H H H H

250 J J J J J J J

Cancer 250 I I I I I I I

250 K K K K K

Block Administration

51 04/19/23

Item Number → 1 1 1 1 1 1 5 5 5

Sample Size ↓ 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 4 5 6

General Pop. 8 250 H H H H H H H

General Pop. 9 250 I I I I I I I

General Pop. 10 250 J J J J J J J

Heart Disease250 H H H H H H H

250 J J J J J J J

Cancer 250 I I I I I I I

250 K K K K K

Between Banks Block Administration Strategy

Item 5 is administered in Block Forms H and I

to General Population Sample 8

as well as Heart Disease

and Cancer Samples

and General Population Sample 9

52 04/19/23

Types of AnalysesTypes of Analyses

• Classical Test Theory StatisticsClassical Test Theory Statistics

• IRT Model AssumptionsIRT Model Assumptions

• Model FitModel Fit

• Differential Item FunctioningDifferential Item Functioning

• Item CalibrationItem Calibration

53 04/19/23

Classical Test Theory StatisticsClassical Test Theory Statistics

• Out of rangeOut of range

• Item frequencies and distributionsItem frequencies and distributions

• Inter-item correlationsInter-item correlations

• Item-scale correlationsItem-scale correlations

• Internal consistency reliabilityInternal consistency reliability

54 04/19/23

IRT Model AssumptionsIRT Model Assumptions

• (Uni)dimensionality(Uni)dimensionality

• Local independenceLocal independence

• MonotonicityMonotonicity

55 04/19/23

Sufficient UnidimensionalitySufficient Unidimensionality

• Confirmatory factor modelsConfirmatory factor models

One factorOne factor

Bifactor (general and group factors)Bifactor (general and group factors)

56 04/19/23

Local IndependenceLocal Independence

• After controlling for dominant factor(s), item After controlling for dominant factor(s), item pairs should not be associated.pairs should not be associated.

Look at residual correlations (> 0.20)Look at residual correlations (> 0.20)

57 04/19/23

MonotonicityMonotonicity

• Probability of selecting a response category Probability of selecting a response category indicative of better health should increase as indicative of better health should increase as underlying health increases.underlying health increases.

• Item response function graphs withItem response function graphs with

y-axis: proportion positive for item stepy-axis: proportion positive for item step

x-axis: raw scale score minus item scorex-axis: raw scale score minus item score

58 04/19/23

59 04/19/23

Category Response Curves for Category Response Curves for Samejima’s Graded Response ModelSamejima’s Graded Response Model

0.0

0.2

0.4

0.6

0.8

1.0

-3 -2 -1 0 1 2 3

Theta (q)

P(X

=k|q

)

P (X = 3|q)P (X = 1|q)

P (X = 2|q)

P (X = 4|q)

60 04/19/23

Model FitModel Fit

• Compare observed and expected response Compare observed and expected response frequencies by item and response categoryfrequencies by item and response category

• Items that do not fit and less discriminating Items that do not fit and less discriminating items identified and reviewed by content items identified and reviewed by content expertsexperts

61 04/19/23

Differential Item FunctioningDifferential Item Functioning

• Uniform DIF Uniform DIF

Threshold parameterThreshold parameter

• Non-uniform DIF Non-uniform DIF

Discrimination parameter Discrimination parameter

• Gender, race/ethnicity, age, education, Gender, race/ethnicity, age, education, diseasedisease

62 04/19/23

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

-4 -3.5 -3 -2.5 -2 -1.5 -1 -0.5 0 0.5 1 1.5 2 2.5 3 3.5 4

Trait level

Pro

ba

bili

ty o

f "Y

es

" R

es

po

ns

e

DIF – Location(Item 1)

DIF – Slope(Item 2)

Hispanics

Whites

Whites

Hispanics

Dichotomous Items Showing DIF(2-Parameter Model)

63 04/19/23

Item CalibrationItem Calibration

• Item parameters (threshold, discrimination)Item parameters (threshold, discrimination)

• Mean differences for studied disease groupsMean differences for studied disease groups

64 04/19/23

Questions?

Public website: http://www.nihpromis.org/

Peer-reviewed manuscripts, e.g.:

Hays, R. D. et al. (in press). Item response theory analyses of physical functioning items in the Medical Outcomes Study. Medical Care.

Reeve, B. B., et al. (submitted). Psychometric evaluation and calibration of health-related quality of life items banks: Plans for the Patient-Reported Outcome Measurement Information System (PROMIS)

65 04/19/23

Acknowledgements

“Slides” in this presentation were lifted or adapted from PROMIS presentations by David Cella, Richard Gershon, Bryce Reeve, and Jim Fries.

66 04/19/23

Appendices

Pre-Application Meeting for the RFA-RM-04-011: Dynamic Assessment of Patient-Reported Chronic Disease OutcomesMonday, January 26, 2004Deborah N. Ader, Ph.D. and Lawrence J. Fine, M.D., Dr.PHTotal Running Time: 02:40:08

http://videocast.nih.gov/PastEvents.asp?c=4&s=151

1 9/23/2015 Patient Reported Outcomes Measurement Information System (PROMIS) Patient Reported...

Documents

Transcript of 1 9/23/2015 Patient Reported Outcomes Measurement Information System (PROMIS) Patient Reported...