Dynamic Treatment Regimes, STAR*D & Voting D. Lizotte, E. Laber & S. Murphy LSU ---- Geaux Tigers! April 2009

Dynamic Treatment Regimes, STAR*D & Voting

D. Lizotte, E. Laber & S. Murphy

LSU ---- Geaux Tigers!

April 2009


Outline

• Dynamic Treatment Regimes

• Constructing Regimes from Data

• A Measure of Confidence: Voting

• STAR*D


Dynamic treatment regimes are individually tailored treatments, with treatment type and dosage changing according to patient outcomes. They operationalize clinical practice.

k stages for one individual: X1, A1, X2, A2, ..., Xk, Ak

Xj: Observation available at jth stage

Aj: Action at jth stage (usually a treatment)


Goal: Construct decision rules that input information available at each stage and output a recommended decision; these decision rules should lead to a maximal mean Y. Y is a known function of the observations and actions.

With k = 2 stages, the dynamic treatment regime is the sequence of two decision rules, one for each stage.


Deriving the Optimal Dynamic Regime: Move Backwards Through Stages.

[Diagram: Stage 1 Observations, Stage 1 Action, Stage 2 Observations, Stage 2 Action, Reward]

You know the multivariate distribution of the observations, actions, and reward.


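Optimal Dynamic Treatment Regime

satisfies a backward recursion: the stage 2 rule picks the action that maximizes the conditional mean of Y given everything observed so far, and the stage 1 rule picks the action that maximizes the conditional mean of that maximized value.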
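
For k = 2, the backward recursion behind the optimal regime can be written in a standard dynamic-programming form. This is a sketch in notation assumed here rather than taken from the slide: d1 and d2 denote the stage 1 and stage 2 decision rules, and Q1 and Q2 the corresponding conditional means.

```latex
\begin{align*}
Q_2(x_1, a_1, x_2, a_2) &= E\bigl[\,Y \mid X_1 = x_1,\ A_1 = a_1,\ X_2 = x_2,\ A_2 = a_2\,\bigr], \\
d_2(x_1, a_1, x_2) &= \arg\max_{a_2} Q_2(x_1, a_1, x_2, a_2), \\
Q_1(x_1, a_1) &= E\bigl[\,\max_{a_2} Q_2(X_1, A_1, X_2, a_2) \mid X_1 = x_1,\ A_1 = a_1\,\bigr], \\
d_1(x_1) &= \arg\max_{a_1} Q_1(x_1, a_1).
\end{align*}
```

The stage 2 rule is solved first because the stage 1 conditional mean depends on what will be done at stage 2.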


Data for Constructing the Dynamic Treatment Regime:

Subject data from sequential, multiple assignment, randomized trials. At each stage subjects are randomized among alternative options.

Aj is a randomized treatment with known randomization probability.


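STAR*D

Stage 1

• Preference: Switch; Treatment: randomized (R) among Bup, Ven, Ser

• Preference: Augment; Treatment: randomized (R) among + Bup, + Bus

Intermediate Outcome

• Remission: Continue on Present Treatment

• No Remission: move to Stage 2

Stage 2

• Preference: Switch; Treatment: randomized (R) among MIRT, NTP

• Preference: Augment; Treatment: randomized (R) among +LI, +THY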


STAR*D Analyses

• X1 includes site, preference for future treatment and can include other baseline variables.

• X2 can include measures of symptoms (QIDS), side effects, and preference for future treatment.

• Y is the (reverse-coded) minimum of the time to remission and 30 weeks.
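
As a small illustration of the outcome definition, the sketch below truncates the time to remission at 30 weeks and then negates it so that larger values are better. Negation is only one possible reverse-coding and is an assumption here, as are the function name `reward` and the convention that a subject who never remits is passed as `None`.

```python
def reward(weeks_to_remission, horizon=30.0):
    """Reverse-coded outcome: truncate at the horizon, then negate.

    Assumption: reverse-coding is done by negation; a subject who never
    remits is passed as None and is treated as reaching the horizon.
    """
    if weeks_to_remission is None:
        observed = horizon
    else:
        observed = min(weeks_to_remission, horizon)
    return -observed


# Faster remission gives a larger (less negative) reward.
print(reward(8), reward(45), reward(None))
```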


Outline

• Dynamic Treatment Regimes

• Constructing Regimes from Data

• A Measure of Confidence: Voting

• STAR*D


Regression-based methods for constructing decision rules

• Q-Learning (Watkins, 1989), a popular method from computer science

• Optimal nested structural mean model (Murphy, 2003; Robins, 2004)

• The first method is equivalent to an inefficient version of the second method if we use linear models, each stage's covariates include the prior stages' covariates, and the actions are centered to have conditional mean zero.


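A Simple Version of Q-Learning

There is a regression for each stage.

• Stage 2 regression: Regress Y on the stage 2 history and treatment to obtain the estimated stage 2 Q-function.

• Stage 1 regression: Regress the estimated stage 2 Q-function, maximized over stage 2 treatments, on the stage 1 history and treatment to obtain the estimated stage 1 Q-function.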
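
The two regressions can be sketched on simulated data. Everything specific below is an assumption made for illustration: the data-generating model in `simulate`, the choice of regressors, the binary treatments coded in {-1, 1}, and the simplification that every subject moves on to stage 2. The least-squares helper `ols` is a plain normal-equations solver, not the authors' software.

```python
import random


def ols(rows, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    p = len(rows[0])
    # Augmented matrix [X'X | X'y].
    m = [[sum(r[i] * r[j] for r in rows) for j in range(p)]
         + [sum(r[i] * v for r, v in zip(rows, y))] for i in range(p)]
    for c in range(p):  # Gauss-Jordan elimination with partial pivoting
        piv = max(range(c, p), key=lambda k: abs(m[k][c]))
        m[c], m[piv] = m[piv], m[c]
        for k in range(p):
            if k != c:
                f = m[k][c] / m[c][c]
                m[k] = [u - f * w for u, w in zip(m[k], m[c])]
    return [m[i][p] / m[i][i] for i in range(p)]


def simulate(n, rng):
    """Hypothetical two-stage randomized trial: tuples (x1, a1, x2, a2, y)."""
    data = []
    for _ in range(n):
        x1 = rng.gauss(0, 1)
        a1 = rng.choice([-1, 1])
        x2 = 0.5 * x1 + rng.gauss(0, 1)
        a2 = rng.choice([-1, 1])
        y = x1 + x2 + 0.5 * a1 + a2 * x2 + rng.gauss(0, 1)
        data.append((x1, a1, x2, a2, y))
    return data


def q_learning(data):
    # Stage 2: regress Y on the history and the stage 2 treatment.
    f2 = [[1, x1, a1, x2, a2, a2 * x2] for x1, a1, x2, a2, _ in data]
    b2 = ols(f2, [d[4] for d in data])
    # Pseudo-outcome: fitted value under the better stage 2 action.
    # For a2 in {-1, 1}, max over a2 of (b4 + b5 * x2) * a2 is |b4 + b5 * x2|.
    ytilde = [b2[0] + b2[1] * x1 + b2[2] * a1 + b2[3] * x2
              + abs(b2[4] + b2[5] * x2) for x1, a1, x2, _, _ in data]
    # Stage 1: regress the pseudo-outcome on stage 1 history and treatment.
    f1 = [[1, x1, a1, a1 * x1] for x1, a1, _, _, _ in data]
    b1 = ols(f1, ytilde)
    return b1, b2


rng = random.Random(0)
b1, b2 = q_learning(simulate(2000, rng))
print("stage 2 coefficients:", [round(b, 2) for b in b2])
print("stage 1 coefficients:", [round(b, 2) for b in b1])
```

The stage 2 fit is done first because its maximized fitted value becomes the dependent variable of the stage 1 regression.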


For patients entering stage 2:

• The fitted stage 2 regression gives the average outcome conditional on patient history (no remission in stage 1; includes past treatment and variables affected by stage 1 treatment).

• Its maximum over stage 2 treatments is the estimated average outcome assuming the “best” treatment is provided at stage 2 (hence the max).

• This maximized value is the dependent variable in the stage 1 regression for patients moving to stage 2.
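
With a linear working model and a binary stage 2 treatment coded in {-1, 1}, the maximization has a closed form. The parametrization below is an assumption made for illustration, not the slide's exact formula: s2 is a feature vector built from the patient history, and α2 and β2 are the main-effect and treatment-effect coefficients.

```latex
\begin{align*}
\hat Q_2(s_2, a_2) &= \hat\alpha_2^{\top} s_2 + \hat\beta_2^{\top} s_2 \, a_2,
  \qquad a_2 \in \{-1, 1\}, \\
\tilde Y &= \max_{a_2 \in \{-1, 1\}} \hat Q_2(s_2, a_2)
  = \hat\alpha_2^{\top} s_2 + \bigl|\hat\beta_2^{\top} s_2\bigr|.
\end{align*}
```

The absolute value is where the non-smoothness enters: the stage 1 dependent variable is a non-differentiable function of the stage 2 estimates wherever the estimated stage 2 treatment effect is zero.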


Optimal Dynamic Treatment Regime

satisfies the backward recursion: maximize over the stage 2 action first, then over the stage 1 action.


A Simple Version of Q-Learning

• The stage 2 Q-function (Y was the dependent variable) yields the stage 2 rule.

• The stage 1 Q-function (the maximized stage 2 Q-function was the dependent variable) yields the stage 1 rule.


Decision Rules: at each stage, recommend the treatment that maximizes that stage's estimated Q-function.
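
For a binary treatment coded in {-1, 1} and a Q-function that is linear in the treatment, maximizing the Q-function reduces to checking the sign of the estimated treatment effect. A minimal sketch, assuming a hypothetical coefficient vector `beta` and feature vector `s` (both names and the example numbers are illustrative only):

```python
def decision_rule(beta, s):
    """Recommend the action in {-1, 1} that maximizes (beta . s) * a.

    Ties (an estimated effect of exactly zero) are broken in favour of -1,
    which is an arbitrary choice made for this sketch.
    """
    effect = sum(b * x for b, x in zip(beta, s))
    return 1 if effect > 0 else -1


# Hypothetical coefficients: intercept 0.5, slope -1.0 on one covariate.
print(decision_rule([0.5, -1.0], [1, 0.2]))
print(decision_rule([0.5, -1.0], [1, 2.0]))
```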


Outline

• Dynamic Treatment Regimes

• Constructing Regimes from Data

• A Measure of Confidence: Voting

• STAR*D


Measures of Confidence

• Classical

– Confidence/Credible intervals and/or p-values concerning β1, β2.

– Confidence/Credible intervals concerning the average response if the estimated regime is used in future to select the treatments.


A Measure of Confidence for use in Exploratory Data Analysis

• Replication Probability

– Estimate the chance that a future trial would find a particular stage j treatment best for a given sj. The vote for treatment aj* is this chance: the probability that a future trial finds aj* best at sj.


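A Measure of Confidence for use in Exploratory Data Analysis

• Replication Probability

– If stage j treatment aj is binary, coded in {-1,1}, then the vote for aj = 1 is the probability that a future trial estimates a positive treatment effect at sj.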
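
In symbols, the replication probability can be written as follows. The notation is assumed here rather than recovered from the slides: the hats denote quantities estimated from a hypothetical future trial, and the linear form of the Q-function in the second line is likewise an assumption.

```latex
\begin{align*}
\mathrm{vote}(a_j^{*};\, s_j)
  &= P\Bigl[\, a_j^{*} = \arg\max_{a_j} \hat Q_j(s_j, a_j) \Bigr], \\
\text{and if } \hat Q_j(s_j, a_j) = \hat\alpha_j^{\top} s_j + \hat\beta_j^{\top} s_j \, a_j
  \text{ with } a_j \in \{-1, 1\}: \quad
\mathrm{vote}(1;\, s_j) &= P\bigl[\, \hat\beta_j^{\top} s_j > 0 \,\bigr].
\end{align*}
```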


Bootstrap Voting

Use bootstrap samples to estimate the vote by the proportion of bootstrap samples in which the treatment is found best.
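
A minimal sketch of the naïve bootstrap vote in the simplest possible setting: one stage, a binary randomized treatment coded in {-1, 1}, and no covariates, so that "found best" means a positive difference in arm means. The simulated trial, the function names, and the reduction to a difference in means are all assumptions made for illustration.

```python
import random


def effect(sample):
    """Mean outcome under a = 1 minus mean outcome under a = -1."""
    pos = [y for a, y in sample if a == 1]
    neg = [y for a, y in sample if a == -1]
    if not pos or not neg:  # degenerate resample with one arm missing
        return 0.0
    return sum(pos) / len(pos) - sum(neg) / len(neg)


def bootstrap_vote(sample, n_boot, rng):
    """Share of bootstrap resamples in which a = 1 looks better than a = -1."""
    wins = 0
    for _ in range(n_boot):
        resample = [rng.choice(sample) for _ in sample]
        wins += effect(resample) > 0
    return wins / n_boot


# Hypothetical trial: 100 subjects per arm, true effect of a = 1 equal to 1.
rng = random.Random(1)
trial = [(a, 0.5 * (a + 1) + rng.gauss(0, 1)) for a in [-1, 1] * 100]
print(bootstrap_vote(trial, 500, rng))
```

This is the plain resample-and-count version only; it does not include any adjustment for the non-regular case.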


The Vote: Intuition

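If the estimated coefficients have a normal distribution centered at the true coefficients with variance matrix Σ, then the vote is a normal probability: the standard normal distribution function evaluated at the true treatment effect at sj divided by the standard deviation of its estimate.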
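
Under the normality assumption the vote can be computed directly. The sketch below assumes the estimated coefficient vector is normal with mean `beta` and covariance `cov`, so that its inner product with the feature vector `s` is normal with mean beta's and variance s'Σs; the function name and example values are hypothetical.

```python
import math


def normal_vote(beta, cov, s):
    """P(beta_hat . s > 0) when beta_hat ~ N(beta, cov)."""
    effect = sum(b * x for b, x in zip(beta, s))
    var = sum(s[i] * cov[i][j] * s[j]
              for i in range(len(s)) for j in range(len(s)))
    z = effect / math.sqrt(var)
    # Standard normal distribution function via the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


identity = [[1.0, 0.0], [0.0, 1.0]]
print(normal_vote([0.0, 0.0], identity, [1.0, 1.0]))  # no true effect
print(normal_vote([1.0, 0.0], identity, [1.0, 0.0]))  # effect of one s.d.
```

With no true effect the vote is one half; it moves toward 1 or 0 as the effect grows relative to its standard deviation.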


Bootstrap Voting

The naïve bootstrap vote estimator is inconsistent.


Bootstrap Voting

A consistent bootstrap vote estimator of

is

where is smooth and


Bootstrap Voting

In our simple example

is approximately


What does the vote mean?

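• The estimated vote is similar to the p-value for the hypothesis of no treatment effect at sj in that it converges, as n increases, to 1 or 0 depending on the sign of the true treatment effect.

• If the true treatment effect is zero, then the limiting distribution is not uniform; instead the estimated vote converges to a constant.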


Outline

• Dynamic Treatment Regimes

• Constructing Regimes from Data

• A Measure of Confidence: Voting

• STAR*D


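STAR*D

Stage 1

• Preference: Switch; Treatment: randomized (R) among Bup, Ven, Ser

• Preference: Augment; Treatment: randomized (R) among + Bup, + Bus

Intermediate Outcome

• Remission: Continue on Present Treatment

• No Remission: move to Stage 2

Stage 2

• Preference: Switch; Treatment: randomized (R) among MIRT, NTP

• Preference: Augment; Treatment: randomized (R) among +LI, +THY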


STAR*D

Regression formula at stage 2:


STAR*D

Regression formula at stage 1:


STAR*D

Decision Rule for subjects preferring a switch at stage 1

• if offer VEN

• if offer SER

• if offer BUP
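
A rule of this shape can be read as choosing, among the three switch options, the one with the largest estimated stage 1 Q-function. A minimal sketch under that reading; the function name and the Q-values in the example are made up for illustration and are not STAR*D results.

```python
def recommend(q_values):
    """Return the treatment label with the largest estimated Q-value."""
    return max(q_values, key=q_values.get)


# Hypothetical Q-values for one patient (reverse-coded weeks to remission).
example = {"VEN": -12.0, "SER": -10.5, "BUP": -11.0}
print(recommend(example))
```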


STAR*D Level 2, Switch


Truth in Advertising: STAR*D

Missing Data + Study Drop-Out

• 1200 subjects begin level 2 (i.e., stage 1)

• 42% study dropout during level 2

• 62% study dropout by 30 weeks.

• Approximately 13% item missingness for important variables observed after the start of the study but prior to dropout.


Truth in Advertising: STAR*D

Multiple Imputation within Bootstrap

• 1000 bootstrap samples of the 1200 subjects

• Using the location-scale model we formed 25 imputations per bootstrap sample.

• The stage j Q-function (regression function) for a bootstrap sample is the average of the 25 Q-functions over the 25 imputations.
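
The nesting of imputation inside the bootstrap can be sketched as two loops: resample subjects, impute several times within each resample, and average the resulting estimates. The sketch below is a simplification under stated assumptions: a single stage with a difference in arm means standing in for the Q-function, a per-arm normal draw standing in for the location-scale imputation model, and much smaller loop counts than the 1000 and 25 used in the analysis.

```python
import random
import statistics


def impute(sample, rng):
    """Fill missing outcomes (None) with draws from a per-arm normal.

    Stand-in for the location-scale imputation model, assumed for this sketch.
    """
    stats = {}
    for arm in (-1, 1):
        obs = [y for a, y in sample if a == arm and y is not None]
        stats[arm] = (statistics.mean(obs), statistics.stdev(obs))
    return [(a, y if y is not None else rng.gauss(*stats[a]))
            for a, y in sample]


def effect(sample):
    """Mean outcome under a = 1 minus mean outcome under a = -1."""
    pos = [y for a, y in sample if a == 1]
    neg = [y for a, y in sample if a == -1]
    return statistics.mean(pos) - statistics.mean(neg)


def mi_bootstrap_effects(sample, n_boot, n_imp, rng):
    """One estimate per bootstrap sample, averaged over its imputations."""
    estimates = []
    for _ in range(n_boot):
        boot = [rng.choice(sample) for _ in sample]
        per_imputation = [effect(impute(boot, rng)) for _ in range(n_imp)]
        estimates.append(statistics.mean(per_imputation))
    return estimates


# Hypothetical trial with about 15% of outcomes missing.
rng = random.Random(3)
trial = []
for a in [-1, 1] * 100:
    y = 0.5 * (a + 1) + rng.gauss(0, 1)
    trial.append((a, None if rng.random() < 0.15 else y))

effects = mi_bootstrap_effects(trial, 100, 5, rng)
print(sum(e > 0 for e in effects) / len(effects))
```

Imputing inside each bootstrap sample, rather than once before resampling, lets the bootstrap variation reflect the uncertainty from the imputation step as well.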


Discussion

• We consider the use of voting to provide a measure of confidence in exploratory data analyses.

• Our method of adapting the bootstrap voting requires a tuning parameter, γ. It is unclear how to best select this tuning parameter.

• We ignored the bias in estimators of stage 1 parameters due to the fact that these parameters are non-regular. The voting method should be combined with bias reduction methods.


This seminar can be found at: http://www.stat.lsa.umich.edu/~samurphy/seminars/LSU2009.ppt

Email me with questions or if you would like a copy!

[email protected]