Benchmarking Depression Treatment 1
Running Head: BENCHMARKING DEPRESSION TREATMENT IN MANAGED CARE
Benchmarking the Effectiveness of Psychotherapy Treatment for Adult Depression in a Managed
Care Environment
Takuya Minami, Department of Educational Psychology, University of Utah
Bruce E. Wampold, Department of Counseling Psychology, University of Wisconsin – Madison
Ronald C. Serlin, Department of Educational Psychology, University of Wisconsin – Madison
Eric G. Hamilton, PacifiCare Behavioral Health, Pittsburgh, Pennsylvania
George S. Brown, Center for Clinical Informatics, Salt Lake City, Utah
John C. Kircher, Department of Educational Psychology, University of Utah
We would like to express our greatest appreciation to PacifiCare Behavioral Health, Inc., for
their permission to utilize their data for this study.
Correspondence should be addressed to Takuya Minami, Department of Educational Psychology,
University of Utah, Salt Lake City, Utah 84112, U.S.A. Email: [email protected].
Abstract
This study investigated the effectiveness of psychotherapy treatment for adult clinical
depression provided in a general clinical setting, notably a managed care environment, using
benchmarks established from efficacy data of published clinical trials. Overall results suggest
clinical equivalence between the effectiveness of psychotherapy provided in a managed care
environment as compared to efficacy observed in clinical trials, although providers in individual
practice exhibit slightly poorer outcomes than either the clinical trial benchmark or providers in
group practices.
Benchmarking the Effectiveness of Psychotherapy Treatment for Adult Depression in a Managed
Care Environment
More than a decade has passed since estimating the effectiveness of psychotherapy as it is
delivered in natural settings (i.e., treatment-as-usual, TAU) was proclaimed as one of the most
critical issues in the field of psychotherapy (e.g., Seligman, 1995; Weisz, Donenberg, Han, &
Weiss, 1995). However, most of the studies in the past decade that investigated outcomes in
clinical settings have involved evaluations of empirically supported treatments (ESTs) and other
manualized treatments that have been implemented in clinical settings rather than evaluating the
effectiveness of TAUs. Specifically, ESTs and other manualized treatments have been
implemented in clinical settings for treating many psychological disorders, including
agoraphobia (Hahlweg, Fiegenbaum, Frank, Schroeder, & von Witzleben, 2001), obsessive-
compulsive disorder (Franklin, Abramowitz, Kozak, Levitt, & Foa, 2000; Warren & Thomas,
2001), panic disorder (Addis, Hatgis, Krasnow, Jacob, Bourne, & Mansfield, 2004; García-
Palacios et al., 2002; Wade, Treat, & Stuart, 1998), posttraumatic stress disorder (Gillespie,
Duffy, Hackmann, & Clark, 2002), social phobia (Lincoln et al., 2003), depression (Merrill,
Tolbert, & Wade, 2003; Persons, Bostrom, & Bertagnolli, 1999), substance abuse (Morgenstern,
Blanchard, Morgan, Labouvie, & Hayaki, 2001), criminal offense (Henggeler, Melton, Brondino,
Scherer, & Hanley, 1997), bulimia nervosa (Tuschen-Caffier, Pook, & Frank, 2001), and
psychosis (Morrison et al., 2004). In other words, the past decade experienced a drastic increase
in the dissemination of manualized treatments that have been shown to be effective in clinical
trials to naturalistic settings. This process assumes that TAUs are not as effective as ESTs in
clinical settings and that outcomes can be improved by delivering ESTs (e.g., Hollon, Thase, &
Markowitz, 2002). However, as researchers have continued to conclude that ESTs should be
disseminated to clinical settings, they have ignored the important question about whether or not
TAUs already achieve outcomes comparable to ESTs (e.g., Addis, 2002; Chorpita et al., 2002;
Herschell, McNeil, & McNeil, 2004; Manderscheid & Henderson, 2004; Stirman, Crits-
Christoph, & DeRubeis, 2004).
At first glance, it appears as though little empirical support exists for the effectiveness of
interventions provided in the community, especially in the area of child and adolescent
psychotherapy (Weiss, Catron, & Harris, 2000; Weiss, Catron, Harris, & Phung, 1999; Weisz &
Weiss, 1989; Weisz, Weiss, & Donenberg, 1992). For example, a benchmarking study
conducted by Weersing and Weisz (2002) described the symptom trajectory of depressed youth
who were provided TAUs in community mental health centers (CMHCs) as resembling that
observed in control groups in clinical trials. However, this study did not provide
unequivocal evidence for the superiority of ESTs over TAUs due to several significant
limitations. First, it is most likely that the adolescents who were treated at the CMHCs were
significantly different from the clients in clinical trials with regard to socioeconomic status, rates
of comorbidity, and other exclusion criteria (Westen, Novotny, & Thompson-Brenner, 2004). In
addition, it is well documented that the therapists’ workload and other work-related
environments are drastically different between clinical trials and clinical settings (Borkovec &
Castonguay, 1998; Rupert & Baird, 2004). Therefore, it is premature to draw conclusions that
ESTs for children and adolescents outperform TAUs.
As for the adult population, not only have very few studies investigated the
effectiveness of TAUs, but the results of these studies are also mixed. An investigation of TAU marital
therapy conducted in Germany revealed that although significant pre-post effects were found,
their overall effect size was low (Hahlweg & Klann, 1997). On the other hand, TAUs conducted
in a community-based substance abuse treatment program showed equivalent results with
cognitive-behavioral therapy (CBT) that was implemented in the same setting (Morgenstern et
al., 2001). Furthermore, Addis et al. (2004) reported that the delivery of an EST in a managed
care environment, notably panic control therapy (PCT), attained significantly better clinical
outcomes for some variables as compared to TAU; however, the PCT therapists received
additional training and supervision and, in any event, the superiority of PCT over TAU was
small (an effect size in the neighborhood of .15; see Wampold, 2005). Therefore, the inferiority
of TAU in clinical settings has not been conclusively established.
In addition, there have been several methodological problems involved in estimating the
size of effects of TAUs. First, because what is often defined as TAU is idiosyncratic to the study,
simplistic conclusions drawn based on comparisons between ESTs and TAUs are misleading
unless one carefully reviews what the authors defined as TAUs. For example, the “usual
services” that were used in comparison against multisystemic therapy (MST) implemented in
community mental health centers involved only monitoring by probation officers and referrals to
other social services and/or special academic programs (Henggeler et al., 1997). Although the
authors rightfully did not generalize their findings to suggest overall superiority of MST relative
to TAUs in general, the lack of uniform agreement about the nature of TAUs has contributed to
erroneous perceptions of the effectiveness of TAUs. Second, even if TAUs are bona fide
psychotherapies, most comparisons against implemented ESTs are unbalanced. For example,
therapists who are under the EST conditions receive additional training and supervision that are
not offered to therapists under the TAU conditions, not to mention the potential demand
characteristics of the study (e.g., Addis et al., 2004; Merrill, Tolbert, & Wade, 2003; Wade, Treat,
& Stuart, 1998; see also Westen, Novotny, & Thompson-Brenner, 2004). Clearly, care must be
taken in defining and operationalizing TAUs to be able to make valid conclusions about the
effectiveness of psychotherapy in clinical settings.
Recently, a promising method was introduced that allows for evaluation of psychotherapy
effectiveness without altering any aspect of TAUs by using benchmarks created from clinical
trials. Specifically, benchmarking allows pre-post outcome data in clinical settings to be
compared against pre-post outcome data from clinical trials (e.g., Merrill, Tolbert, & Wade,
2003; Wade, Treat, & Stuart, 1998; Weersing & Weisz, 2002). Benchmarking involves the
following three steps: (a) calculation of benchmarks by aggregating pre-post effect sizes
observed in clinical trials, (b) calculation of an effect size in the clinical setting, and (c) statistical
comparisons between the benchmarks and the clinical settings effect size (Minami, Serlin,
Wampold, Kircher, & Brown, 2005; Minami, Wampold, Serlin, Kircher, & Brown, 2005). Thus,
this strategy allows for direct statistical evaluation of effectiveness by comparing effects
produced in clinical settings to a rigorous standard established by clinical trials.
The purpose of the current study was to evaluate the effectiveness of TAUs delivered in a
managed care health organization (HMO) using a benchmarking strategy. Specifically, a subset
of the HMO data containing adult clients diagnosed with major depressive disorder (MDD;
American Psychiatric Association, 1994) was statistically compared against benchmarks of adult
depression treatment derived from clinical trials conducted to evaluate the efficacy of
treatment for depression (Minami, Serlin, et al., 2005; Minami, Wampold, et al., 2005). In
addition, the effect of possible moderators of treatment effectiveness (i.e., individual versus
group providers, medication use) was examined.
Method
Participants
Base HMO data. The original database (labeled as the “base HMO data”) for this study
contained client outcome data for 48,038 adult clients who received treatment from 6,007
treatment providers¹ between February 8, 1999 and December 31, 2004, under the insurance
coverage of PacifiCare Behavioral Health, Inc. (PBH). Available demographics are provided in
Tables 1 and 2. PBH does not routinely collect other demographic variables of clients and
providers, such as race/ethnicity, education, and income; these were thus unavailable. The base
HMO data was reduced based on inclusion and exclusion criteria as explained below.
Outcome Measure
The Outcome Questionnaire – 30.1² (OQ-30; Lambert, Hatfield, Vermeersch, Burlingame,
Reisinger, & Brown, 2001) was used to assess outcomes of clients included in the HMO dataset.
This instrument is a shortened version of the Outcome Questionnaire – 45.2 (OQ-45;
Vermeersch, Lambert, & Burlingame, 2000; Wells, Burlingame, Lambert, & Hoag, 1996), which
was designed to measure patient progress in three dimensions: (a) subjective discomfort, (b)
interpersonal relationships, and (c) social role performance. The OQ-45 was designed to be a
low-cost, brief-but-broad measure, which was sensitive to short-term changes yet reliable and
valid. The OQ-30 is a briefer version of the OQ-45 that was specifically created for clients to
conveniently complete multiple times. Lambert et al. reported high internal consistency and test-
retest reliability as well as high concurrent validity with other symptom measures, such as the
Inventory of Interpersonal Problems (Horowitz, Rosenberg, Baer, Ureno, & Villasenor, 1988),
Social Adjustment Scale (Weissman & Bothwell, 1976), and Beck Depression Inventory (BDI;
Beck & Steer, 1987).
Procedure
Initial data collection. Clients were asked to fill out the OQ-30 before their first, third,
and fifth sessions, and before every fifth session thereafter; this assessment has been
implemented system-wide at PBH as their routine assessment, and has a current participation rate
of approximately 70%. Whereas clinical trials can readily define an episode of care as
the period between when the participants entered the clinical trial and when they “completed” or
“dropped out,” episodes cannot be as clearly defined in clinical settings. Therefore, for the
current study, an episode of care was defined as a cluster of outcome assessment points with
no gap of more than 90 days between consecutive observations. In other words, if any two
consecutive observations were more than 90 days apart, the former observation was deemed to be
the score of the last session of an episode, and the latter observation was considered the intake
score of the next episode.
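The 90-day rule above can be sketched in a few lines of Python. This is only an illustrative sketch, not the procedure actually run on the PBH database: the function name `split_into_episodes`, the `(date, score)` record layout, and the sample client are all assumptions made for the example.

```python
from datetime import date, timedelta

MAX_GAP = timedelta(days=90)  # gap threshold stated in the text


def split_into_episodes(observations):
    """Group (date, score) pairs into episodes of care.

    A new episode starts whenever two consecutive observations are
    more than 90 days apart; otherwise the observation joins the
    current episode.
    """
    episodes = []
    for obs_date, score in sorted(observations):
        if episodes and obs_date - episodes[-1][-1][0] <= MAX_GAP:
            episodes[-1].append((obs_date, score))
        else:
            episodes.append([(obs_date, score)])
    return episodes


# Hypothetical client record (dates and scores are invented for illustration).
client = [
    (date(2001, 1, 10), 62),
    (date(2001, 1, 24), 58),
    (date(2001, 2, 21), 51),
    (date(2001, 9, 5), 60),   # more than 90 days after the previous observation
    (date(2001, 9, 19), 55),
]

episodes = split_into_episodes(client)
first_episode = episodes[0]
intake_score = first_episode[0][1]   # first observation of the episode
last_score = first_episode[-1][1]    # last available observation of the episode
print(len(episodes), intake_score, last_score)
```

In this sketch the September observations form a second episode, so only the first three observations would contribute an intake and a last available score for the first episode.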
Data reduction. The base HMO data was reduced to match, as best possible, the clinical
population represented in the clinical trials that investigated the efficacy of psychotherapy
treatments for adult depression. First, for the purpose of the study, only the first episode of care
for a given client was included so as to maintain independence of observations as best possible.
The database was further reduced based on the clients’ demographics and severity of depression
to match those of the clinical trials that were used to create the benchmarks. Thus, clients were
included in the subset data if they met all three of the following criteria: (a) client age of 18 years
or older, (b) a primary diagnosis of MDD, and (c) a score of 43 or above on the OQ-30, which
serves as the clinical cutoff score based on Jacobson and Truax’s (1991) formula (Lambert et al.,
2001). Client data regarding concurrent substance abuse, other comorbidity (e.g., psychotic
features or personality disorders), or suicidal ideation were unavailable and thus were not used as
exclusion criteria, although such exclusion criteria are typically used in clinical trials of
depression. Applying these criteria resulted in a subset database of outcomes for 6,323 adult
clients with depression who received treatment from 2,001 providers (i.e., subset HMO data).
All available demographic and clinical information from the base and subset HMO data with
regard to the clients and therapists is provided in Tables 1 and 2, respectively. No data
were available on race/ethnicity and other demographics of both the providers and clients, as the
HMO does not routinely collect this information.
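The three inclusion criteria can be expressed as a simple filter. The sketch below is illustrative only: the `FirstEpisode` record, its field names, and the sample clients are hypothetical, and it assumes that the OQ-30 cutoff of 43 is applied to the intake score of the first episode.

```python
from dataclasses import dataclass

OQ30_CLINICAL_CUTOFF = 43  # clinical cutoff stated in the text
MIN_AGE = 18               # adult clients only


@dataclass
class FirstEpisode:
    """Hypothetical record for a client's first episode of care."""
    client_id: str
    age: int
    primary_diagnosis: str
    intake_oq30: float


def meets_inclusion_criteria(ep):
    """Return True if the episode satisfies all three criteria (a)-(c)."""
    return (
        ep.age >= MIN_AGE
        and ep.primary_diagnosis == "MDD"
        and ep.intake_oq30 >= OQ30_CLINICAL_CUTOFF
    )


# Invented records for illustration.
records = [
    FirstEpisode("c1", 34, "MDD", 58),
    FirstEpisode("c2", 17, "MDD", 61),  # excluded: under 18
    FirstEpisode("c3", 45, "MDD", 40),  # excluded: below the clinical cutoff
    FirstEpisode("c4", 52, "GAD", 66),  # excluded: primary diagnosis is not MDD
    FirstEpisode("c5", 29, "MDD", 43),  # included: exactly at the cutoff
]

subset = [r for r in records if meets_inclusion_criteria(r)]
print([r.client_id for r in subset])
```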
PLEASE INSERT TABLES 1 AND 2 AROUND HERE
Subset HMO Data Effect Size Calculation
Treatment effect size of the subset HMO data was calculated following basic meta-
analytic procedures (Becker, 1988; Hedges & Olkin, 1985). Specifically, where $M_1$ and $M_2$ are
the intake OQ-30 and last available OQ-30 means, respectively, and $SD_1$ is the standard
deviation of the intake score, the biased estimator $g_{HMO}$ is
\[ g_{HMO} = \frac{M_1 - M_2}{SD_1}. \tag{1} \]
Here, the standard deviation of the intake score was used rather than a pooled standard deviation
because it is presumably less influenced by repeated testing and/or treatment, presenting a less
confounded value (Becker). Using the correction, as derived by Hedges and Olkin, the unbiased
estimate of the effect size $d_{HMO}$, to be benchmarked, is
\[ d_{HMO} = \left(1 - \frac{3}{4N - 5}\right) g_{HMO}, \tag{2} \]
where $N$ is the sample size of the clinical settings data. The estimated variance of $d_{HMO}$ is
\[ \hat{\sigma}^2(d_{HMO}) = \frac{2(1 - r_{12})}{N} + \frac{d_{HMO}^2}{2N}, \tag{3} \]
where $r_{12}$ is the estimated correlation between the intake and last available scores (Becker). For
the current sample, the correlation of the intake and last available score was .4966 and this value
was used for $r_{12}$ in equation 3.
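As a rough check on equations 1 through 3, the following Python sketch applies them to the rounded summary statistics reported in the Results section ($N$ = 6,323, intake mean 63.58, last available mean 54.19, intake standard deviation 12.75, $r_{12}$ = .4966). The function names are illustrative and are not taken from the original analysis.

```python
def biased_effect_size(m1, m2, sd1):
    """Equation 1: g = (M1 - M2) / SD1, standardized by the intake SD."""
    return (m1 - m2) / sd1


def unbiased_effect_size(g, n):
    """Equation 2: small-sample correction d = (1 - 3 / (4N - 5)) * g."""
    return (1 - 3 / (4 * n - 5)) * g


def effect_size_variance(d, n, r12):
    """Equation 3: estimated variance 2(1 - r12) / N + d^2 / (2N)."""
    return 2 * (1 - r12) / n + d ** 2 / (2 * n)


# Rounded summary statistics as reported in the Results section.
N, M1, M2, SD1, R12 = 6323, 63.58, 54.19, 12.75, 0.4966

g_hmo = biased_effect_size(M1, M2, SD1)
d_hmo = unbiased_effect_size(g_hmo, N)
var_d_hmo = effect_size_variance(d_hmo, N, R12)
print(f"g = {g_hmo:.4f}, d = {d_hmo:.4f}, variance = {var_d_hmo:.7f}")
```

Because the inputs are rounded to two decimal places, the sketch reproduces the reported effect size of 0.7360 only to within rounding error (about three decimal places). With a sample this large, the small-sample correction changes the estimate very little.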
Clinical Trials Benchmarks
Benchmarks for both the treatment efficacy of psychotherapy for adult depression and
natural history of depression were derived by Minami, Wampold, et al. (2005). The treatment
efficacy benchmarks were derived by meta-analytically aggregating pretest to last available
assessment effects (i.e., both completers and intent-to-treat samples change effects) in published
clinical trials of psychotherapy treatment for adult depression. The natural history benchmarks
were constructed using methods identical to those for the treatment efficacy benchmarks, but from
the symptom trajectory of wait-list control groups. For this study, intent-to-treat benchmarks that
aggregated outcomes of self-report measures assessing broad symptoms were selected in order to
make a valid comparison, as the HMO data set used a global well-being measure (viz., the OQ-30)
and contained all patients who entered treatment and completed at least two outcome measures.
Accordingly, the treatment efficacy benchmark was $d_{B(TE)} = 0.831$ and the natural history
benchmark was $d_{B(NH)} = 0.122$ for global measures in intent-to-treat samples (Minami,
Wampold, et al.). The mean number of weeks in treatment in the clinical trials was
approximately 16 for the efficacy benchmark and 10 for the natural history benchmark (Minami,
Wampold, et al.).
Benchmarking
Testing against the treatment efficacy benchmark. The subset HMO data was tested
using the benchmarking strategy illustrated in Minami, Serlin, et al. (2005) against the treatment
efficacy benchmark $d_{B(TE)} = 0.831$. This strategy tests the true effect size in the population as
represented by the clinical settings data against a critical value derived from the benchmark,
taking into consideration a predetermined margin of $d = 0.2$ between the benchmark and the
population to claim clinical equivalence while maintaining an overall Type I error of .05. In
other words, if the clinical settings effect was within $d = 0.2$ below the efficacy benchmark (i.e.,
$d = 0.631$), the population effect size as represented by this data was considered clinically
equivalent to the efficacy benchmark. The margin of $d = 0.2$ was selected based on Cohen’s
(1988) suggestion that this magnitude of effect size is small, and therefore, any difference
between the benchmarks and the population effect size within one fifth of a standard
deviation was considered to be clinically trivial (Minami, Wampold, et al., 2005).
To statistically compare the population effect size represented by the subset HMO data
against the benchmark taking into consideration the $d = 0.2$ margin, the “good-enough
principle” as illustrated in Serlin and Lapsley (1985, 1993) was used. This procedure allows for
hypothesis testing with a range-null rather than a point-null hypothesis, while maintaining an
overall Type I error of .05. Specifically, with subset HMO effect size data $d_{HMO}$, its sample
size $N$, and $d = 0.2$ margin $\Delta$, the noncentral $t$ test statistic $t_{HMO}$ has a noncentrality parameter
\[ \lambda_{TE} = \sqrt{N}\,\left(\delta_{B(TE)} - \Delta\right), \tag{4} \]
where $\delta_{B(TE)}$ is the true treatment efficacy benchmark. When $\nu$ is the degrees of freedom (i.e.,
$\nu = N - 1$), $t_{HMO}$ will be tested against the noncentral $t$ critical value $t_{\nu,\lambda:\alpha}$ at $\alpha = .05$.
Testing against the natural history benchmark. For the population effect size to claim
any effectiveness over and above the natural symptom trajectory of depression, the clinical
setting effect must exceed at minimum $d = 0.2$ above the natural history benchmark
$d_{B(NH)} = 0.122$ (i.e., $d = 0.322$). Thus, using an identical method as illustrated for comparison
against the treatment efficacy benchmark other than the direction of the margin, the noncentral $t$
test statistic $t_{HMO}$ has a noncentrality parameter
\[ \lambda_{NH} = \sqrt{N}\,\left(\delta_{B(NH)} + \Delta\right), \tag{5} \]
where $\delta_{B(NH)}$ is the true depression natural history benchmark. For this analysis, $t_{HMO}$ will be
tested against the noncentral $t$ critical value $t_{\nu,\lambda:\alpha}$ at $\alpha = .05$. Again, however, whether or not
the population effect size as represented by the subset HMO data is statistically and clinically
superior to the natural trajectory of depression could be determined by visually inspecting the
figure in Minami, Wampold, et al. (2005) without actual calculation.
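The quantities entering the two range-null tests can be sketched as follows. The test statistic is taken to be $\sqrt{N}$ times the observed effect size, as in the Results section, and the noncentrality parameters follow equations 4 and 5. Function names are illustrative assumptions, and the input values are the ones this paper reports (the natural history benchmark is entered as 0.1216, the unrounded value used in the Results).

```python
import math

MARGIN = 0.2  # equivalence margin (Delta) stated in the text


def t_statistic(d, n):
    """Observed statistic: t_HMO = sqrt(N) * d_HMO."""
    return math.sqrt(n) * d


def noncentrality_efficacy(benchmark, n, margin=MARGIN):
    """Equation 4: lambda_TE = sqrt(N) * (benchmark - margin)."""
    return math.sqrt(n) * (benchmark - margin)


def noncentrality_natural_history(benchmark, n, margin=MARGIN):
    """Equation 5: lambda_NH = sqrt(N) * (benchmark + margin)."""
    return math.sqrt(n) * (benchmark + margin)


# Values reported in this paper.
N = 6323
D_HMO = 0.7360
EFFICACY_BENCHMARK = 0.831
NATURAL_HISTORY_BENCHMARK = 0.1216

t_hmo = t_statistic(D_HMO, N)
lambda_te = noncentrality_efficacy(EFFICACY_BENCHMARK, N)
lambda_nh = noncentrality_natural_history(NATURAL_HISTORY_BENCHMARK, N)
print(f"t = {t_hmo:.2f}, lambda_TE = {lambda_te:.2f}, lambda_NH = {lambda_nh:.2f}")
```

The observed statistic is then compared with the 95th percentile of a noncentral $t$ distribution with the corresponding noncentrality parameter. That quantile is not computed here because it requires a noncentral $t$ routine that the Python standard library does not provide.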
Results
Subset HMO Effect Size
The $N = 6{,}323$ adult clients who had clinical depression in the subset HMO data had
mean intake and last session scores of $M_1 = 63.58$ ($SD = 12.75$) and $M_2 = 54.19$ ($SD = 15.68$),
respectively. Thus, the effect size $d_{HMO}$ was
\[ d_{HMO} = \left(1 - \frac{3}{4 \cdot 6323 - 5}\right) \cdot \frac{63.58 - 54.19}{12.75} = 0.7360. \tag{6} \]
As $r_{12} = .4966$ in this subset HMO data, the variance $\hat{\sigma}^2(d_{HMO})$ was
\[ \hat{\sigma}^2(d_{HMO}) = \frac{2(1 - .4966)}{6323} + \frac{(0.7360)^2}{2 \cdot 6323} = 0.0002021. \tag{7} \]
Benchmarking
Benchmarking the overall subset HMO effect size. With the subset HMO effect size of
$d_{HMO} = 0.7360$ with $N = 6{,}323$, the test statistic $t_{HMO}$ was
\[ t_{HMO} = \sqrt{6323} \cdot 0.7360 = 58.53. \tag{8} \]
When compared against the treatment efficacy benchmark, it had a noncentrality parameter
\[ \lambda_{TE} = \sqrt{6323}\,(0.8310 - 0.2) = 50.18. \tag{9} \]
Tested against the noncentral $t$ critical value $t_{6323,\,50.18:.95} = 51.99$, $t_{HMO}$ was statistically
significant at $p < .0001$. That is, the overall subset HMO effect size $d_{HMO} = 0.7360$ was
clinically equivalent to the treatment efficacy benchmark $d_{B(TE)} = 0.831$. Here, with $N = 6{,}323$,
the 95th percentile one-tailed critical value that the subset HMO effect size needed to exceed to
claim clinical equivalence with the treatment efficacy benchmark was
\[ d_{CV(TE)} = \frac{t_{6323,\,50.18:.95}}{\sqrt{N}} = \frac{51.99}{\sqrt{6323}} = 0.6538 \tag{10} \]
(Minami, Serlin, et al., 2005). Clearly, the subset HMO effect size $d_{HMO} = 0.7360$ exceeded the
critical value.
When compared against the natural history benchmark $d_{B(NH)} = 0.122$, $t_{HMO} = 58.53$ had
a noncentrality parameter
\[ \lambda_{NH} = \sqrt{6323}\,(0.1216 + 0.2) = 25.57. \tag{11} \]
Tested against the noncentral $t$ critical value $t_{6323,\,25.57:.95} = 27.26$, $t_{HMO}$ was statistically
significant at $p < .0001$. For reference, the 95th percentile one-tailed critical value for the
clinical settings effect size to exceed to claim clinical effectiveness over and above the natural
trajectory of depression was $d_{CV(NH)} = 0.3429$ (Minami, Serlin, et al., 2005). Thus, the primary
conclusion of this analysis is that providers in clinical practice treating major depression attain
outcomes comparable to those achieved by treatments provided in clinical trials and surpass
improvement inherent in the natural course of depression.
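The conversion from a critical $t$ value to a critical effect size (equation 10) is a single division by $\sqrt{N}$, which the sketch below applies to the two critical $t$ values reported above. The critical $t$ values themselves are taken from the text rather than recomputed; if SciPy is available, a noncentral $t$ quantile function such as `scipy.stats.nct.ppf` could presumably be used to obtain them, but that step is an assumption and is not shown. The function name is illustrative.

```python
import math


def critical_effect_size(t_critical, n):
    """Equation 10: d_CV = t_critical / sqrt(N)."""
    return t_critical / math.sqrt(n)


N = 6323
D_HMO = 0.7360            # observed effect size reported in the text
T_CRIT_EFFICACY = 51.99   # reported noncentral t critical value (lambda = 50.18)
T_CRIT_NATURAL = 27.26    # reported noncentral t critical value (lambda = 25.57)

d_cv_te = critical_effect_size(T_CRIT_EFFICACY, N)
d_cv_nh = critical_effect_size(T_CRIT_NATURAL, N)

# Decision rules: the observed effect must exceed each critical value.
equivalent_to_trials = D_HMO > d_cv_te
exceeds_natural_history = D_HMO > d_cv_nh
print(f"d_CV(TE) = {d_cv_te:.4f}, d_CV(NH) = {d_cv_nh:.4f}")
print(equivalent_to_trials, exceeds_natural_history)
```

Since the critical $t$ values are rounded to two decimals, the recomputed critical effect sizes match the reported 0.6538 and 0.3429 only to within rounding error.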
Moderator analyses of the subset HMO data. As some clinical characteristics were
potential moderators, the subset HMO data was further divided based on the following factors:
(a) providers’ practice context (i.e., individual or group practice providers) and (b) client
medication use. The results of these analyses are shown in Table 3. Although the subset HMO
effect size as a whole cleared the treatment efficacy benchmark, the effect size was significantly
impacted by both providers’ practice and concurrent medication use. Although clients in group
practices cleared the treatment efficacy benchmark regardless of whether or not the clients were
on medication, clients treated by individual providers cleared the benchmark only with
concurrent medication use (see Table 4).
PLEASE INSERT TABLES 3 AND 4 AROUND HERE
Because effect size is affected by initial severity (i.e., typically, those who are initially
more severe demonstrate larger effects given sufficient number of sessions; Garfield, 1986;
Lambert, 2001), two subgroups within the individual practice/no medication condition were
further analyzed. For the first subset, we selected only those clients who collectively met the
initial severity observed in the individual practice/medication condition (i.e., intake $M \approx 65.5$).
This subset within the individual practice/no medication condition exceeded the critical value
$d_{CV} = 0.6865$ ($N = 1{,}092$, $p < .0001$), indicating that among clients with average intake scores
at this severity, providers in individual practice collectively performed equivalently as compared
to clinical trials, even without medication. However, with the second subset that matched the
intake severity to that observed among group practice/no medication condition (i.e., intake
$M \approx 61.3$), providers in individual practices did not exceed the critical value.
As the providers in individual practice treating clients who were not on medication did
not achieve clinical equivalence with the clinical trials, we estimated the percentage of providers
in this condition that collectively exceeded the critical value as proposed a priori. In other words,
we sought to investigate what percentage of the providers, performing on the poorer end, needed
to be excluded in order for the set of providers in individual practice to meet the critical value to
claim clinical equivalence with the clinical trials. This analysis showed that excluding the
poorest functioning five percent of providers in individual practice was sufficient for these
providers to meet the required treatment benchmark.
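The text does not specify how providers were ranked or how the pooled effect was recomputed, so the following Python sketch is only one plausible reading of this analysis, not the authors' procedure. It assumes that providers are ranked by the mean raw change of their clients, that the poorest-ranked providers are dropped one at a time, and that the pooled effect size is recomputed with equation 1 after each exclusion. All names and data are hypothetical.

```python
from statistics import mean, stdev


def pooled_effect_size(clients):
    """Equation 1 on pooled (intake, last) pairs: (M1 - M2) / SD1.

    Needs at least two clients so that the intake SD is defined.
    """
    intake = [c[0] for c in clients]
    last = [c[1] for c in clients]
    return (mean(intake) - mean(last)) / stdev(intake)


def fraction_to_exclude(providers, d_critical):
    """Smallest fraction of poorest providers whose removal lets the
    remaining providers collectively exceed d_critical.

    providers maps a provider id to a list of (intake, last) pairs.
    Ranking by mean raw change is an assumption made for this sketch.
    Returns None if no exclusion level reaches the critical value.
    """
    ranked = sorted(
        providers,
        key=lambda p: mean(i - l for i, l in providers[p]),
    )
    for n_dropped in range(len(ranked)):
        kept = ranked[n_dropped:]
        clients = [c for p in kept for c in providers[p]]
        if pooled_effect_size(clients) > d_critical:
            return n_dropped / len(ranked)
    return None


# Invented data: three providers with two clients each.
providers = {
    "A": [(60, 45), (70, 50)],
    "B": [(65, 55), (55, 45)],
    "C": [(60, 75), (70, 85)],  # clients deteriorate under this provider
}
excluded = fraction_to_exclude(providers, d_critical=1.0)
print(excluded)
```

With these invented numbers, the pooled effect falls short of the illustrative critical value until provider C is dropped, so the function reports that one third of the providers must be excluded.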
Discussion
There has been a dearth of studies investigating the effectiveness of TAUs delivered in
clinical settings. To our knowledge, the present article reports the first benchmarking study of
TAUs for the psychotherapy treatment of adult clinical depression. Notable is the use of
benchmarks for treatment and natural history derived meta-analytically from clinical trials and
the use of the range-null hypothesis testing procedure (Serlin & Lapsley, 1985, 1993), which
allowed for a $d = 0.2$ margin between the benchmarks and the subset HMO data to claim
clinical equivalence between the two.
The results of the present study clearly demonstrated the general effectiveness of
psychotherapy treatment for adult depression provided in general clinical settings. The
assumption that providers in clinical practice produce outcomes inferior to what would be
accomplished had these providers used an EST for depression seems to be unwarranted, given
that the providers matched the effects produced by clinical trials.
An interesting result is that providers in group practices attain better outcomes than
providers in individual practice. Although there are no data in the present study that provide an
explanation for this result, several interesting conjectures are apparent. There may be something
intrinsic to group practices that augments the performance of their members, such as the
availability of colleague consultation or multidisciplinary approaches. On the other hand, the
explanation may involve selection; better therapists either select group practices or group
practices select better therapists. Finally, because the present study was naturalistic (i.e.,
patients were not randomly assigned to therapists), it may be that patients with better prognoses
select group practices. However, even the providers in individual practice achieve commendable
outcomes when one considers the fact that in clinical trials therapists typically are selected for
their expertise and are trained, monitored, and supervised (e.g., Rounsaville, O’Malley, Foley, &
Weissman, 1988; see also Rupert & Baird, 2004; Westen et al., 2004). In the present study,
eliminating the poorest functioning five percent allowed the providers in individual practice to
achieve clinical equivalence to the clinical trial benchmark; we speculate that clinical trialists are
more selective than this. Given that there is significant variation in outcomes attributable to
therapists in clinical trials and in managed care settings (Crits-Christoph, Barnackie, Kurcias,
Beck, Carroll, Perry, et al., 1991; Wampold & Brown, 2005), the selection of therapists in
clinical trials would tend to augment the effects in that context.
It is also important to note that whereas the mean number of weeks in treatment in the
clinical trials included in the efficacy benchmark was approximately 16, the mean number of
weeks for the subset HMO data was less than 9, with a median of 6. This comparison further
strengthens the evidence of effectiveness in clinical settings. Although an argument could be
made that most of the change in psychotherapy occurs early in treatment (Barkham, Rees, Stiles,
Shapiro, Hardy, & Reynolds, 1996; Howard, Kopta, Krause, & Orlinsky, 1986), the shorter
length of treatment nevertheless has a profound impact on both the well-being of the client and cost
effectiveness.
It also appears that the concurrent administration of medications augments the effects of
psychotherapy, a result consistent with the conclusions of some research (Thase & Jindal, 2004).
However, this result appears to be due, in a large part, to the fact that those on medication are
initially more severely dysfunctional. Again, because of the lack of random assignment, making
a strong conclusion about the effects of medication is precluded.
There are limitations that need to be considered in interpreting the results of the present
study. First of all, the treatment efficacy and natural history benchmarks that were used in this
study, despite being the best indexes currently available, are a compilation of various self-report
outcome measures that assess global symptoms (e.g., SCL-90; Derogatis, 1977). Therefore, as
the subset HMO data used the OQ-30, outcome measures for the benchmarks and the effects in
clinical practice are not exact matches and thus differences among the measures could potentially
impact the results. However, these benchmarks appeared to be the most representative to
compare against, given that the OQ-30 is also a self-report measure of global symptoms.
It could be claimed that the observed effectiveness demonstrated in the managed care
context was solely an artifact of regression to the mean. Because initial severity is correlated
with pre to posttest effect sizes, valid comparison of the clinical effect in the HMO data to the
benchmark assumes equal severity. Care was taken to only include clients who were given a
MDD diagnosis and who were in the clinical range of the OQ, a strategy that simulates the
inclusion criteria for clinical trials of depression. However, the equivalence of initial severity in
the benchmark samples with that of the clinical sample is unknown. Although this
criticism cannot be completely refuted, we also benchmarked the subset HMO data against the
natural history benchmark, which aggregated pre-post effect sizes of wait-list control conditions.
Under the assumption that the clinical conditions between treatments and wait-list controls in the
clinical trials were sufficiently equivalent, it could be inferred that the natural history benchmark
would represent the combination of natural remission of depression and the regression artifact. It
is important to note that clinical equivalence as demonstrated using the benchmarking strategy
does not explain away other differences between clinical settings and clinical trials. Typically,
naturalistic practice settings and research environments are quite different with regard to client
and therapist factors such as heterogeneity among clients, funding structure, supervision and
training, length of treatment, and clinical caseload (Nathan, Stuart, & Dolan, 2000; Rounsaville,
O’Malley, Foley, & Weissman, 1988; Rupert & Baird, 2004; Seligman, 1995; Wampold, 1997,
2001; Westen & Morrison, 2001; Westen et al., 2004). In fact, it is the incorporation of these
documented differences in interpreting our results that further strengthens our conclusion. For
example, Westen and Morrison reported that approximately 70% of clients are screened out in
clinical trials based on their strict inclusion and exclusion criteria, whereas in the clinical settings,
such criteria are ethically impermissible. In addition, Rupert and Baird have documented the
client load and lack of supervision available in the general clinical settings, which is in stark
contrast to the conditions of therapists participating in rigorously conducted clinical trials who
receive additional training and supervision. Despite these inequalities in conditions, the results
of this study are reassuring in that adult clients receiving psychotherapy in general clinical settings
for depression are most likely receiving quality care.
Finally, the moderator analyses are difficult to interpret because clients were not
randomly assigned to conditions (i.e., were not randomly assigned to group or individual
practices or to medication or no medication conditions). Consequently, the results of the
moderator analyses need to be considered tentative.
The increased demand to demonstrate accountability and effectiveness of clinical practice
has put pressure on clinical settings to assess clinical outcomes. However, there are growing
concerns among therapists regarding this issue. For example, Hahlweg and Klann (1997)
reported that the response rate of therapists who were asked to measure outcomes was low,
possibly because of anxiety that the results might be used for administrative purposes (e.g.,
promotion and/or retention). In addition, Plante, Andersen, and Boccaccini (1999), in their
survey of Clinical Diplomates of the American Board of Professional Psychology, reported that
many considered routine use of outcome measures too lengthy and unnecessary. Although
more effectiveness studies are necessary, the above concerns must be addressed adequately so
that therapists will willingly participate in outcome assessments. Without active participation
from therapists, effectiveness cannot be adequately assessed.
Simultaneously, the current trend to implement ESTs and other manualized treatments in
clinical settings has also met with resistance from therapists, as many perceive such
implementation as a hindrance to their autonomy and creativity (e.g., Plante et al., 1999).
Moreover, it is impossible to determine whether ESTs were satisfactorily implemented
without monitoring therapists’ adherence, which would be highly impractical
considering the cost. For clinicians who are faced with choosing between measuring their
clinical outcomes and abandoning their TAUs in favor of ESTs, measuring outcomes may in fact
appear favorable. Specifically, if providers in clinical settings could demonstrate that they are
attaining outcomes clinically equivalent to efficacy observed in clinical trials, the current
rationale for implementing ESTs, which is to ensure accountability, becomes illogical. After all,
the results with clients in the field are the goal of treatment; if providers can document
that they are achieving desired outcomes, it makes little sense to suggest that they adopt
particular treatments.
References
Addis, M. E. (2002). Methods for disseminating research products and increasing evidence-
based practice: Promises, obstacles, and future directions. Clinical Psychology: Science
and Practice, 9, 367-378.
Addis, M. E., Hatgis, C., Krasnow, A. D., Jacob, K., Bourne, L., & Mansfield, A. (2004).
Effectiveness of cognitive-behavioral treatment for panic disorder versus treatment as
usual in a managed care setting. Journal of Consulting and Clinical Psychology, 72, 625-
635.
American Psychiatric Association (1994). Diagnostic and statistical manual of mental disorders
(4th ed.). Washington, DC: Author.
Barkham, M., Rees, A., Stiles, W. B., Shapiro, D. A., Hardy, G. E., & Reynolds, S. (1996).
Dose-effect relations in time-limited psychotherapy for depression. Journal of
Consulting and Clinical Psychology, 64, 927-935.
Beck, A. T., & Steer, R. A. (1987). Beck Depression Inventory manual. San Antonio, TX:
Harcourt Brace Jovanovich.
Becker, B. J. (1988). Synthesizing standardized mean-change measures. British Journal of
Mathematical and Statistical Psychology, 41, 257-278.
Borkovec, T. D., & Castonguay, L. G. (1998). What is the scientific meaning of empirically
supported therapy? Journal of Consulting and Clinical Psychology, 66, 136-142.
Chorpita, B. F., Yim, L. M., Donkervoet, J. C., Arensdorf, A., Amundsen, M. J., McGee, C.,
Serrano, A., Yates, A., Burns, J. A., & Morelli, P. (2002). Toward large-scale
implementation of empirically supported treatments for children: A review and
observations by the Hawaii Empirical Basis to Services Task Force. Clinical Psychology:
Science and Practice, 9, 165-190.
Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Hillsdale, NJ:
Erlbaum.
Crits-Christoph, P., Barnackie, K., Kurcias, J. S., Beck, A. T., Carroll, K., Perry, K., Luborsky,
L., McLellan, T., Woody, G., Thompson, L., Gallager, D., & Zitrin, C. (1991). Meta-
analysis of therapist effects in psychotherapy outcome studies. Psychotherapy Research,
1, 81-91.
Derogatis, L. R. (1977). The SCL-90 Manual I: Scoring, administration and procedures.
Baltimore, MD: Johns Hopkins University School of Medicine, Clinical Psychometrics
Unit.
Franklin, M. E., Abramowitz, J. S., Kozak, M. J., Levitt, J. T., & Foa, E. B. (2000).
Effectiveness of exposure and ritual prevention for obsessive-compulsive disorder:
Randomized compared with nonrandomized samples. Journal of Consulting and Clinical
Psychology, 68, 594-602.
García-Palacios, A., Botella, C., Robert, C., Baños, R., Perpiña, C., Quero, S., & Ballester, R.
(2002). Clinical utility of cognitive-behavioural treatment for panic disorder. Results
obtained in different settings: A research centre and a public mental health unit. Clinical
Psychology and Psychotherapy, 9, 373-383.
Garfield, S. L. (1986). Research on client variables in psychotherapy. In S. L. Garfield & A. E.
Bergin (Eds.), Handbook of psychotherapy and behavior change (3rd ed., pp. 213-256).
New York: John Wiley & Sons.
Gillespie, K., Duffy, M., Hackmann, A., & Clark, D. M. (2002). Community based cognitive
therapy in the treatment of post-traumatic stress disorder following the Omagh bomb.
Behaviour Research and Therapy, 40, 345-357.
Hahlweg, K., Fiegenbaum, W., Frank, M., Schroeder, B., & von Witzleben, I. (2001). Short- and
long-term effectiveness of an empirically supported treatment for agoraphobia. Journal
of Consulting and Clinical Psychology, 69, 375-382.
Hahlweg, K., & Klann, N. (1997). The effectiveness of marital counseling in Germany: A
contribution to health services research. Journal of Family Psychology, 11, 410-421.
Hamilton, M. A. (1960). A rating scale for depression. Journal of Neurology, Neurosurgery,
and Psychiatry, 23, 56-62.
Hamilton, M. A. (1967). Development of a rating scale for primary depressive illness. British
Journal of Social and Clinical Psychology, 6, 278-296.
Hedges, L. V., & Olkin, I. (1985). Statistical methods for meta-analysis. San Diego, CA:
Academic Press.
Henggeler, S. W., Melton, G. B., Brondino, M. J., Scherer, D. G., & Hanley, J. H. (1997).
Multisystemic therapy with violent and chronic juvenile offenders and their families: The
role of treatment fidelity in successful dissemination. Journal of Consulting and Clinical
Psychology, 65, 821-833.
Herschell, A. D., McNeil, C. B., & McNeil, D. W. (2004). Clinical child psychology’s progress
in empirically supported treatments. Clinical Psychology: Science and Practice, 11, 267-
288.
Hollon, S. D., Thase, M. E., & Markowitz, J. C. (2002). Treatment and prevention of depression.
Psychological Science in the Public Interest, 3, 39-77.
Horowitz, L. M., Rosenberg, S. E., Baer, B. A., Ureno, G., & Villasenor, V. S. (1988). Inventory
of interpersonal problems: Psychometric properties and clinical applications. Journal of
Consulting and Clinical Psychology, 56, 885-892.
Howard, K. I., Kopta, S. M., Krause, M. S., & Orlinsky, D. E. (1986). The dose-effect
relationship in psychotherapy. American Psychologist, 41, 159-164.
Jacobson, N. S., & Truax, P. (1991). Clinical significance: A statistical approach to defining
meaningful change in psychotherapy research. Journal of Consulting and Clinical
Psychology, 59, 12-19.
Lambert, M. J. (2001). The status of empirically supported therapies: Comment on Westen and
Morrison’s (2001) multidimensional meta-analysis. Journal of Consulting and Clinical
Psychology, 69, 910-913.
Lambert, M. J., Hatch, D. R., Kingston, M. D., & Edwards, B. C. (1986). Zung, Beck, and
Hamilton Rating Scales as measures of treatment outcome: A meta-analytic comparison.
Journal of Consulting and Clinical Psychology, 54, 54-59.
Lambert, M. J., Hatfield, D. R., Vermeersch, D. A., Burlingame, G. M., Reisinger, C. W., &
Brown, G. S. (2001). Administration and scoring manual for the LSQ (Life Status
Questionnaire). East Setauket, NY: American Professional Credentialing Services.
Lincoln, T. M., Rief, W., Hahlweg, K., Frank, M., von Witzleben, I., Schroeder, B., &
Fiegenbaum, W. (2003). Effectiveness of an empirically supported treatment for social
phobia in the field. Behaviour Research and Therapy, 41, 1251-1269.
Manderscheid, R. W., & Henderson, M. J. (2004). Mental health, United States, 2002 executive
summary. Administration and Policy in Mental Health, 32, 49-55.
Merrill, K. A., Tolbert, V. E., & Wade, W. A. (2003). Effectiveness of cognitive therapy for
depression in a community mental health center: A benchmarking study. Journal of
Consulting and Clinical Psychology, 71, 404-409.
Minami, T., Serlin, R. C., Wampold, B. E., & Kircher, J. C. (2005). How to benchmark clinical
settings effect sizes against clinical trials. Manuscript submitted for publication.
Minami, T., Wampold, B. E., Serlin, R. C., Kircher, J. C., & Brown, G. S. (2005). Benchmarks
for the treatment of adult depression: Issues and results. Manuscript submitted for
publication.
Morgenstern, J., Blanchard, K. A., Morgan, T. J., Labouvie, E., & Hayaki, J. (2001). Testing the
effectiveness of cognitive-behavioral treatment for substance abuse in a community
setting: Within treatment and posttreatment findings. Journal of Consulting and Clinical
Psychology, 69, 1007-1017.
Morrison, A. P., Renton, J. C., Williams, S., Knight, D. H., Kreutz, M., Nothard, S., Patel, U., &
Dunn, G. (2004). Delivering cognitive therapy to people with psychosis in a community
mental health setting: An effectiveness study. Acta Psychiatrica Scandinavica, 220, 36-
44.
Nathan, P. E., Stuart, S. P., & Dolan, S. L. (2000). Research on psychotherapy efficacy and
effectiveness: Between Scylla and Charybdis? Psychological Bulletin, 126, 964-981.
Persons, J. B., Bostrom, A., & Bertagnolli, A. (1999). Results of randomized controlled trials of
cognitive therapy for depression generalize to private practice. Cognitive Therapy and
Research, 23, 535-548.
Plante, T. G., Andersen, E. N., & Boccaccini, M. T. (1999). Empirically supported treatments
and related contemporary changes in psychotherapy practice: What do clinical ABPPs
think? Clinical Psychologist, 52, 23-31.
Rounsaville, B. J., O’Malley, S., Foley, S., & Weissman, M. M. (1988). Role of manual-guided
training in the conduct and efficacy of interpersonal psychotherapy for depression.
Journal of Consulting and Clinical Psychology, 56, 681-688.
Rupert, P. A., & Baird, K. A. (2004). Managed care and the independent practice of psychology.
Professional Psychology: Research and Practice, 35, 185-193.
Seligman, M. E. P. (1995). The effectiveness of psychotherapy: The Consumer Reports Study.
American Psychologist, 50, 965-974.
Serlin, R. C., & Lapsley, D. K. (1985). Rationality in psychological research: The good-enough
principle. American Psychologist, 40, 73-83.
Serlin, R. C., & Lapsley, D. K. (1993). Rational appraisal of psychological research and the
good-enough principle. In G. Keren & C. Lewis (Eds.), A handbook for data analysis in
the behavioral sciences: Methodological issues (pp. 199-228). Hillsdale, NJ: Lawrence
Erlbaum Associates.
Shadish, W. R., Matt, G. E., Navarro, A. M., & Phillips, G. (2000). The effects of psychological
therapies under clinically representative conditions: A meta-analysis. Psychological
Bulletin, 126, 512-529.
Shadish, W. R., Matt, G. E., Navarro, A. M., Siegle, G., Crits-Christoph, P., Hazeligg, M. D.,
Jorm, A. F., Lyons, L. C., Nietzel, M. T., Prout, H. T., Robinson, L., Smith, M. L.,
Svartberg, M., & Weiss, B. (1997). Evidence that therapy works in clinically
representative conditions. Journal of Consulting and Clinical Psychology, 65, 355-365.
Stirman, S. W., Crits-Christoph, P., & DeRubeis, R. J. (2004). Achieving successful
dissemination of empirically supported psychotherapies: A synthesis of dissemination
theory. Clinical Psychology: Science and Practice, 11, 343-359.
Thase, M. E., & Jindal, R. D. (2004). Combining psychotherapy and psychopharmacology for
treatment of mental disorders. In M. J. Lambert (Ed.), Handbook of psychotherapy and
behavior change (5th ed.). New York: John Wiley & Sons.
Tuschen-Caffier, B., Pook, M., & Frank, M. (2001). Evaluation of manual-based cognitive-
behavioral therapy for bulimia nervosa in a service setting. Behaviour Research and
Therapy, 39, 299-308.
Vermeersch, D. A., Lambert, M. J., & Burlingame, G. M. (2000). Outcome Questionnaire: Item
sensitivity to change. Journal of Personality Assessment, 74, 242-261.
Wade, W. A., Treat, T. A., & Stuart, G. L. (1998). Transporting an empirically supported
treatment for panic disorder to a service clinic setting: A benchmarking strategy. Journal
of Consulting and Clinical Psychology, 66, 231-239.
Wampold, B. E. (1997). Methodological problems in identifying efficacious psychotherapies.
Psychotherapy Research, 7, 21-43.
Wampold, B. E. (2001). The great psychotherapy debate: Model, methods, and findings.
Mahwah, NJ: Lawrence Erlbaum Associates.
Wampold, B. E. (2005). Do Therapies Designated as ESTs for Specific Disorders Produce
Outcomes Superior to Non-EST Therapies? Not a scintilla of evidence to support ESTs
as more effective than other treatments. In J. C. Norcross, L. E. Beutler & R. F. Levant
(Eds.), Evidence-based practices in mental health: Debate and dialogue on the
fundamental questions (pp. 299-308, 317-319). Washington, DC: American
Psychological Association.
Wampold, B. E., & Brown, G. (2005). Estimating therapist variability in outcomes attributable
to therapists: A naturalistic study of outcomes in managed care. Journal of Consulting
and Clinical Psychology, 73, 914-923.
Warren, R., & Thomas, J. C. (2001). Cognitive-behavior therapy of obsessive-compulsive
disorder in private practice: An effectiveness study. Journal of Anxiety Disorders, 15,
277-285.
Weersing, V. R., & Weisz, J. R. (2002). Community clinic treatment of depressed youth:
Benchmarking usual care against CBT clinical trials. Journal of Consulting and Clinical
Psychology, 70, 299-310.
Weiss, B., Catron, T., & Harris, V. (2000). A 2-year follow-up of the effectiveness of traditional
child psychotherapy. Journal of Consulting and Clinical Psychology, 68, 1094-1101.
Weiss, B., Catron, T., Harris, V., & Phung, T. M. (1999). The effectiveness of traditional child
psychotherapy. Journal of Consulting and Clinical Psychology, 67, 82-94.
Weissman, M. M., & Bothwell, S. (1976). Assessment of social adjustment by patient self-
report. Archives of General Psychiatry, 33, 1111-1115.
Weisz, J. R., Donenberg, G. R., Han, S. S., & Weiss, B. (1995). Bridging the gap between
laboratory and clinical in child and adolescent psychotherapy. Journal of Consulting and
Clinical Psychology, 63, 688-701.
Weisz, J. R., & Weiss, B. (1989). Assessing the effects of clinic-based psychotherapy with
children and adolescents. Journal of Consulting and Clinical Psychology, 57, 741-746.
Weisz, J. R., Weiss, B., & Donenberg, G. R. (1992). The lab versus the clinic: Effects of child
and adolescent psychotherapy. American Psychologist, 47, 1578-1585.
Wells, M. G., Burlingame, G. M., Lambert, M. J., & Hoag, M. (1996). Conceptualization and
measurement of patient change during psychotherapy: Development of the Outcome
Questionnaire and Youth Outcome Questionnaire. Psychotherapy, 33, 275-283.
Westen, D., & Morrison, K. (2001). A multidimensional meta-analysis of treatments for
depression, panic, and generalized anxiety disorder: An empirical examination of the
status of empirically supported therapies. Journal of Consulting and Clinical Psychology,
69, 875-899.
Westen, D., Novotny, C. M., & Thompson-Brenner, H. (2004). The empirical status of
empirically supported psychotherapies: Assumptions, findings, and reporting in
controlled clinical trials. Psychological Bulletin, 130, 631-663.
Footnotes
1. Treatment providers include providers who practice individually (i.e., individual
providers) and those who are in group practice (i.e., group providers). Group providers have an ID
solely for their group, and thus the practicing providers within a group do not have individual
ID numbers.
2. Because of a licensing agreement, the OQ-30 is named the Life Status Questionnaire (LSQ) at
PacifiCare Behavioral Health, Inc.
Table 1
Client Demographic Information of the Base and Subset HMO Data

| Characteristic | Statistic | Base HMO Data | Subset HMO Data |
|---|---|---|---|
| Clients | N (%) | 48,038 (100.00) | 6,323 (13.16^a) |
| Female | n (%) | 32,713 (68.10) | 4,514 (71.39) |
| Age | M (SD, Range, Mdn) | 39.75 (11.54, 18 - 96, 39) | 39.99 (11.19, 18 - 86, 40) |
| Diagnosis: Depression | n (%) | 9,024 (18.79) | 6,323 (70.07^a) |
| Diagnosis: Adjustment | n (%) | 5,450 (11.35) | - |
| Diagnosis: Anxiety | n (%) | 2,324 (4.84) | - |
| Diagnosis: Other | n (%) | 2,380 (4.95) | - |
| Diagnosis: Unknown | n (%) | 28,860 (60.08) | - |
| Treatment: Sessions | M (SD, Range, Mdn) | 9.22 (9.10, 1 - 160, 6) | 8.78 (8.61, 1 - 127, 6) |
| Provider Training Level: Masters | n (%) | 19,185 (39.94) | 1,884 (46.81) |
| Provider Training Level: Doctoral | n (%) | 8,696 (18.10) | 975 (24.22) |
| Provider Training Level: Medical | n (%) | 2,695 (5.61) | 99 (2.46) |
| Provider Training Level: Other/Unknown | n (%) | 17,462 (36.35) | 1,067 (26.51) |
| Medication: Yes | n (%) | 8,355 (17.39) | 3,645 (57.65) |
| Medication: No | n (%) | 9,986 (20.79) | 2,263 (35.79) |
| Medication: Unknown | n (%) | 29,697 (61.82) | 415 (6.56) |

^a Percentage of base HMO data.
Table 2
Provider Demographic Information of the Base and Subset HMO Data

| Characteristic | Statistic | Base HMO Data | Subset HMO Data |
|---|---|---|---|
| Providers | N (%) | 6,007 (100.00) | 2,001 (33.31^a) |
| Individual Practice^b | n (%) | 5,911 (98.40) | 1,920 (95.95) |
| Female | n (%) | 2,404 (40.67) | 879 (45.78) |
| Male | n (%) | 1,304 (22.06) | 450 (23.44) |
| Gender Unknown | n (%) | 2,203 (37.27) | 591 (30.78) |
| Training Level: Masters | n (%) | 2,300 (38.91) | 814 (42.40) |
| Training Level: Doctoral | n (%) | 1,231 (20.83) | 459 (23.91) |
| Training Level: Medical | n (%) | 166 (2.81) | 47 (2.45) |
| Training Level: Other/Unknown | n (%) | 2,214 (37.46) | 600 (31.25) |
| Years in Practice | M (SD, Range, Mdn) | 22.41 (7.89, 4 - 53, 22) | 22.05 (7.85, 4 - 50, 22) |

^a Percentage of total number of providers (i.e., N = 6,007).
^b Group practices do not have individual IDs for their therapists, and thus all following
demographics pertain to therapists in individual practice unless otherwise noted.
Table 3
Subset HMO Data Benchmarking

| Condition | N | Intake M (SD) | Last M (SD) | d | vs. Treatment Efficacy: d_CV(TE) | vs. Treatment Efficacy: p | vs. Natural History: d_CV(NH) | vs. Natural History: p |
|---|---|---|---|---|---|---|---|---|
| Overall | 6,323 | 63.58 (12.75) | 54.19 (15.68) | 0.7360 | 0.6538 | < .0001 | 0.3429 | < .0001 |
| Practice: Individual | 4,025 | 63.43 (12.90) | 55.08 (15.36) | 0.6482 | 0.6597 | .1617 | - | - |
| Practice: Group | 2,298 | 63.82 (12.48) | 52.65 (16.13) | 0.8950 | 0.6691 | < .0001 | - | - |
| Medication: Concurrent | 3,645 | 65.40 (12.91) | 54.98 (16.17) | 0.8071 | 0.6611 | < .0001 | - | - |
| Medication: None | 2,263 | 60.54 (11.80) | 52.75 (14.63) | 0.6601 | 0.6693 | .1055 | - | - |
| Medication: Unknown | 415 | 64.08 (13.15) | 55.15 (16.40) | 0.6778 | 0.7221 | .1757 | - | - |

Note. Client sample size, intake score means (standard deviations), last score means (standard
deviations), effect sizes (i.e., d), critical values (i.e., d_CV), and significance level (i.e., p) are
for their respective conditions. Hyphen denotes analyses that were not conducted.
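As a rough check on how the entries in Table 3 relate to one another, the sketch below recomputes the pre-post effect size from the tabled means and standard deviations and compares it with the tabled critical value. It assumes that d is the intake mean minus the last mean, divided by the intake standard deviation. This assumption reproduces the tabled values only approximately, and the exact estimator used in the study (for example, any small-sample correction) is not restated here. The names ROWS, pre_post_d, and exceeds_critical_value are illustrative and do not come from the study.

```python
# Summary statistics copied from Table 3 (subset HMO data):
# (intake mean, intake SD, last mean, treatment efficacy critical value).
ROWS = {
    "Overall": (63.58, 12.75, 54.19, 0.6538),
    "Practice: Individual": (63.43, 12.90, 55.08, 0.6597),
    "Practice: Group": (63.82, 12.48, 52.65, 0.6691),
}


def pre_post_d(intake_mean, intake_sd, last_mean):
    """Pre-post effect size under the ASSUMED definition:
    (intake mean - last mean) / intake SD."""
    return (intake_mean - last_mean) / intake_sd


def exceeds_critical_value(d, d_cv):
    """True when the observed effect size is larger than the critical value."""
    return d > d_cv


if __name__ == "__main__":
    for condition, (m_intake, sd_intake, m_last, d_cv) in ROWS.items():
        d = pre_post_d(m_intake, sd_intake, m_last)
        flag = exceeds_critical_value(d, d_cv)
        print(f"{condition}: d = {d:.4f}, d_CV = {d_cv:.4f}, exceeds = {flag}")
```

Under this assumption, the rows in which the recomputed d exceeds the critical value are the same rows for which Table 3 reports p < .05.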
Table 4
Subset HMO Data Benchmarking by Provider Practice and Medication

| Practice | Medication | Client N | Session: Intake M (SD) | Session: Last M (SD) | d | d_CV | p |
|---|---|---|---|---|---|---|---|
| Individual Practice | Concurrent | 2,211 | 65.59 (13.15) | 56.30 (15.82) | 0.7065 | 0.6698 | .0008 |
| Individual Practice | No Medication | 1,552 | 60.18 (11.72) | 53.05 (14.24) | 0.6082 | 0.6774 | .7964 |
| Individual Practice | No Medication (Concurrent^a) | 1,092 | 65.46 (9.88) | 56.49 (14.00) | 0.9073 | 0.6865 | < .0001 |
| Individual Practice | No Medication (Group^b) | 1,453 | 61.32 (11.24) | 53.81 (14.06) | 0.6677 | 0.6790 | .1038 |
| Individual Practice | Unknown | 262 | 64.58 (13.48) | 56.78 (16.48) | 0.5774 | 0.7466 | .7914 |
| Group Practice | Concurrent | 1,434 | 65.13 (12.54) | 52.96 (16.49) | 0.9698 | 0.6793 | < .0001 |
| Group Practice | No Medication | 711 | 61.32 (11.95) | 52.08 (15.41) | 0.7721 | 0.7001 | .0005 |
| Group Practice | Unknown | 153 | 63.22 (12.58) | 52.36 (15.93) | 0.8593 | 0.7843 | .0008 |

^a Initial OQ-30 severity matched with clients who are concurrently on medication and treated by
individual providers.
^b Initial OQ-30 severity matched with clients who are not on medication and treated by group
providers.