Reliability of a Treatment-Based Classification System for Subgrouping People With Low Back Pain -...



journal of orthopaedic & sports physical therapy | volume 42 | number 9 | september 2012 | 797

[ research report ]

Despite the high cost associated with the fluctuating clinical course of low back pain (LBP), no treatment strategy, surgical or conservative, has been shown to be consistently effective in reducing the often persistent symptoms, functional limitations, and disability associated with this condition. The lack of beneficial effects of conservative treatments for LBP may be due to the lack of a pathoanatomy-specific diagnosis,27,30,31 as fewer than 20% of individuals with LBP can be given a specific, structurally based diagnosis.22

In the absence of a specific pathoanatomical diagnosis and to better direct treatment, a number of research and clinical groups have suggested that there is a need for a system that classifies individuals with LBP based on key clinical symptoms and multidimensional features of the LBP presentation.1-3,8,9,14,29,35 The basis for this suggestion is that people with LBP represent a heterogeneous group, consisting of several smaller homogeneous subgroups. Logically, if the subgroups of patients were classified based on criteria relevant to their specific symptoms, these more homogeneous subgroups would have a higher likelihood of responding to matched treatment approaches. Such a classification system could be useful both in prognosis and treatment, rendering the development and testing of classification systems of LBP a top priority.3,8,35

Numerous classification systems have been described for patients with LBP.34 Delitto and colleagues11 described a treatment-based classification (TBC) system and used information gathered from the patient history and physical examination to place a patient into 1 of 4 classification categories that directed patient treatment: manipulation, specific exercise, stabilization, and traction. The

STUDY DESIGN: Observational, cross-sectional reliability study.

OBJECTIVES: To examine the interrater reliability of novice raters in their use of the treatment-based classification (TBC) system for low back pain and to explore the patterns of disagreement in classification errors.

BACKGROUND: Although the interrater reliability of individual test items in the TBC system is moderate to good, some error persists in classification decision making. Understanding which classification errors are common could direct further refinement of the TBC system.

METHODS: Using previously recorded patient data (n = 24), 12 novice raters classified patients according to the TBC schema. These classification results were combined with those of 7 other raters, allowing examination of the overall agreement using the kappa statistic, as well as agreement/disagreement among pairwise comparisons in classification assignments. A chi-square test examined differences in percent agreement between the novice and more experienced raters and differences in classification distributions between these 2 groups of raters.

RESULTS: Among 12 novice raters, there was 80.9% agreement in the pairs of classification (κ = 0.62; 95% confidence interval: 0.59, 0.65) and an overall 75.5% agreement (κ = 0.57; 95% confidence interval: 0.55, 0.69) for the combined data set. Raters were least likely to agree on a classification of stabilization (77.5% agreement). The overall percentage of pairwise classification judgments that disagreed was 24.5%, with the most common disagreement being between manipulation and stabilization (11.0%), followed by a mismatch between stabilization and specific exercise (8.2%).

CONCLUSION: Additional refinement is needed to reduce rater disagreement that persists in the TBC decision-making algorithm, particularly in the stabilization category. J Orthop Sports Phys Ther 2012;42(9):797-805, Epub 7 June 2012. doi:10.2519/jospt.2012.4078

KEY WORDS: clinical decision making, lumbar spine, manipulation, stabilization

1 Professor, Department of Rehabilitation and Movement Science, University of Vermont, Burlington, VT. 2 Associate Professor, Division of Physical Therapy, University of Utah, Salt Lake City, UT; Clinical Outcomes Research Scientist, Intermountain Healthcare, Salt Lake City, UT. 3 Research Coordinator, Department of Rehabilitation and Movement Science, University of Vermont, Burlington, VT. 4 Research Associate Professor, Department of Mathematics and Statistics, University of Vermont, Burlington, VT. This research project was funded by the College of Nursing and Health Sciences Dean’s Research Incentive Fund, University of Vermont. The protocol and consent form were approved by the Institutional Review Board at the University of Vermont. Address correspondence to Dr Sharon M. Henry, Professor, Department of Rehabilitation and Movement Science, 305 Rowell Building, University of Vermont, Burlington, VT 05401. E-mail: [email protected] Copyright ©2012 Journal of Orthopaedic & Sports Physical Therapy

SHARON M. HENRY, PT, PhD1 • JULIE M. FRITZ, PT, PhD2 • ANDREA R. TROMBLEY, MPT3 • JANICE Y. BUNN, PhD4

Reliability of a Treatment-Based Classification System for Subgrouping People With Low Back Pain


physical examination involved observation of postural alignment, a neurological exam, and assessment of change in the patient’s symptoms with single and repeated spinal movements. Clusters of these clinical signs and symptoms were then associated with specific approaches for treatment to increase the likelihood of treatment success. Studies of people with acute12,17 and work-related LBP15,24 have guided the further refinement of the decision-making process related to determining the selection of 1 of the 4 treatment options for each patient.14 Although the reliability of clinicians in performing the individual exam items used in the clinical prediction rules has been shown to be fair to good,16,17,25 agreement among clinicians to classify patients with LBP using the TBC system has only been tested in a preliminary manner.13,17,23

In a recent study, Stanton et al36 examined the prevalence of patients meeting the criteria for each TBC treatment subgroup using 2 different methodologies: individual subgroup criteria versus a comprehensive classification algorithm. Using patients with acute or subacute (duration less than 90 days) LBP, the authors found that approximately 50% of the participants met the criteria for only 1 TBC subgroup. Twenty-five percent of the patients met the criteria for more than 1 subgroup, and the other 25% of the participants did not meet the criteria for any subgroup. Given that only 50% of the cases were classified in a mutually exclusive manner, further refinement of the TBC algorithm would be beneficial to guide clinicians in treatment selection for the other 50% of patients who did not meet any of the subgroup criteria or met more than 1 subgroup criterion.

To be useful for research or clinical practice, a classification system must demonstrate certain characteristics related to reliability, feasibility, generalizability, and various aspects of validity.6,34 Several of these characteristics have been investigated with respect to the TBC system. The classification categories appear to identify meaningful subgroups of patients based on the results of randomized clinical trials comparing the outcomes of patients whose treatment is matched to their classification and those whose treatment is unmatched.4,5,7,13 Specific criteria have been identified for the various classification categories, with evidence of fair to good reliability among raters for many of these criteria.13,16,20,25 The overall reliability of classification judgments has also been examined in several studies. Initial studies reported percentage agreements ranging from 55% to 65%, with corresponding kappa values from 0.45 to 0.56.17,23,36 These results have led to the development of a more explicit decision-making algorithm.14 Fritz et al13 also examined the interrater reliability of classification decisions made with this algorithm using 7 therapists with varying levels of experience and found an overall agreement between therapists of 76%, with a kappa value of 0.60 (95% confidence interval [CI]: 0.56, 0.64), with no differences in reliability based on experience. Recently, Stanton et al36 proposed a modified, comprehensive, hierarchical TBC algorithm that would also provide guidance for classifying patients who do not clearly meet the criteria set forth in the original algorithm. In this algorithm, when classifying the approximately 25% of patients who met more than 1 subgroup criterion, raters would need to use additional information outlining “factors favoring” and “factors against” classifications in each treatment category. In the study, the reliability of the 2 novice raters examining 32 patients was moderate, with a kappa value of 0.52 (95% CI: 0.27, 0.77) and a percentage agreement of 67%. Interestingly, following the first examiner’s assessment, 38% of the patients had an unclear classification, whereas 61% had an unclear assessment following the second examiner’s assessment.

Although the interrater reliability of the classification judgments using the TBC system is moderate to good,36,37 it is clear that some degree of error persists in the classification decision making associated with the TBC system. Ideally, any individual with LBP should fit primarily 1 classification category. The complex and multidimensional clinical presentation of patients with LBP, however, results in

TABLE 1. Characteristics of the 11 UVM Raters Who Were Physical Therapists*

Characteristic | Value
Entry-level physical therapy education, n |
BS | 8
MPT or MSPT | 3
Years in practice, y† | 13.3 ± 6.2 (4-27)
Percent FTE of time in physical therapy practice, n |
0% to 25% | 3
26% to 50% | 1
51% to 75% | 1
76% to 100% | 6
Previous experience with the TBC system, n |
None | 5
A little | 6
Moderate | 0
A lot | 0

Abbreviations: BS, bachelor of science; FTE, full-time equivalent; MPT, master of physical therapy; MSPT, master of science in physical therapy; TBC, treatment-based classification; UVM, University of Vermont.
*The 12th rater from UVM was not a physical therapist.
†Values are mean ± SD (range).


some inevitable overlap between categories. Stanton et al36 reported that approximately 25% of the patients examined were classified in more than 1 category, with overlap between the manipulation and the specific exercise categories being the most common (68.4%). Identifying which categories overlap most commonly may assist in refining the decision-making algorithm and ultimately improve the usefulness of the TBC system. Thus, the purpose of this study was to examine the interrater reliability of applying the classification criteria of the TBC system, and to explore the pattern of disagreements that may exist, to determine which categories are most difficult to distinguish.

METHODS

Subjects

The subjects in this study were 12 raters from the University of Vermont (UVM), 11 of whom were licensed physical therapists. The raters completed a questionnaire to assess their education level, credentials, physical therapy experience, and familiarity with the TBC schema (TABLE 1). The physical therapists had an average of 13.3 (range, 4-27) years of practice, and all practiced in an outpatient orthopaedic setting. None had completed a residency program or had an American Physical Therapy Association specialty certification. The 12th rater was a neuroscientist who specialized in posture control. All raters were informed of the experimental protocol and the potential risks of the study, and gave written consent prior to their participation. The protocol and consent form were previously approved by the University of Vermont Institutional Review Board.

Materials and Patient Data Collection

The 12 raters were instructed in the use of a published algorithm13 (FIGURE) to determine patient classification according to the TBC schema, using patient data from a previously published, randomized clinical trial.4 Thus, the raters in the current study did not actually collect the patient data. The previously published, randomized clinical trial was designed to identify some subgroups of patients with LBP; however, the traction subgroup was not considered in this trial,4 because patients with signs of nerve root compression were excluded. Thus, the traction category of the TBC algorithm was not included in the current study.
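As an illustration only, the three entry questions of the algorithm (FIGURE) can be sketched as simple predicates. The class name, field names, and helper functions below are hypothetical, and the inequality directions (onset under 16 days, straight leg raise above 91°, age under 40 years) are assumptions inferred from the "factors favoring" lists, because the comparison symbols did not survive in this transcript. The sketch does not model the Yes/No flow or the "factors favoring/against" step.

```python
from dataclasses import dataclass


@dataclass
class ExamFindings:
    """Hypothetical container for the findings the three questions rely on."""
    same_direction_centralizing_movements: int  # movements that centralize symptoms in one direction
    centralizes_and_peripheralizes_opposite: bool
    symptom_duration_days: int
    symptoms_distal_to_knee: bool
    average_slr_rom_deg: float
    prone_instability_positive: bool
    aberrant_movements_present: bool
    age_years: int


def meets_specific_exercise(f: ExamFindings) -> bool:
    # Either criterion is sufficient (the figure joins them with OR).
    return (f.same_direction_centralizing_movements >= 2
            or f.centralizes_and_peripheralizes_opposite)


def meets_manipulation(f: ExamFindings) -> bool:
    # Both criteria are required (AND). The "< 16" cutoff direction is assumed.
    return f.symptom_duration_days < 16 and not f.symptoms_distal_to_knee


def meets_stabilization(f: ExamFindings) -> bool:
    # At least 3 of the 4 findings. The "> 91" and "< 40" directions are assumed.
    findings = [
        f.average_slr_rom_deg > 91,
        f.prone_instability_positive,
        f.aberrant_movements_present,
        f.age_years < 40,
    ]
    return sum(findings) >= 3


def criteria_met(f: ExamFindings) -> list:
    """Return the names of every subgroup whose entry criteria are satisfied."""
    checks = [
        ("specific exercise", meets_specific_exercise),
        ("manipulation", meets_manipulation),
        ("stabilization", meets_stabilization),
    ]
    return [name for name, check in checks if check(f)]


# Made-up patient: recent onset, no distal symptoms, 3 of 4 stabilization findings.
example = ExamFindings(0, False, 10, False, 95.0, True, False, 32)
print(criteria_met(example))
```

A patient who satisfies more than one set of criteria, like the made-up example, or none at all is exactly the case in which a rater has to weigh the additional factors, which is where disagreement can arise.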

In the previously published trial,4 patients with LBP who qualified for inclusion were between the ages of 18 and 65 years. They were referred to physical therapy for their LBP symptoms, the duration of which had to be less than 90 days. The patients also had to have a modified Oswestry Disability Questionnaire18 score of greater than 25% to be included. Potential patient participants were excluded if they had any history of surgery to the lumbosacral region, were pregnant, had a visible lateral shift or acute kyphotic deformity, had no reproduction of symptoms with lumbar range of motion or palpation, or had signs of

[Flowchart. Each of the 3 questions below has a Yes branch and a No branch; the arrow layout is not recoverable from this transcript.]

Specific Exercise Classification
Does the patient:
1. Centralize with 2 or more movements in the same direction (ie, flexion or extension) OR
2. Centralize with a movement in 1 direction and peripheralize with an opposite movement

Manipulation/Mobilization Classification
Does the patient:
1. Have a recent onset of symptoms (<16 d) AND
2. No symptoms distal to the knee

Stabilization Classification
Does the patient have at least 3 of the following:
1. Average SLR ROM >91°
2. Positive prone instability test
3. Positive aberrant movements
4. Age <40 y

Which subgroup does the patient best fit?

Mobilization/Manipulation
Factors Favoring
• More recent onset (<16 days)
• LBP only (no distal symptoms)
• Low FABQ scores (FABQW score <19)
Factors Against
• Symptoms below the knee
• Increasing episode frequency
• Peripheralization with motion testing
• No pain with spring testing

Stabilization
Factors Favoring
• Hypermobility with spring testing
• Aberrant motions present
• Increasing episode frequency
• Younger age (<40 years)
• 3 or more prior episodes
• Greater SLR ROM (>91° bilateral)
Factors Against
• Discrepancy in SLR ROM (>10°)
• Low FABQ scores (FABQPA score <9)

Specific Exercise
Factors Favoring
• Preference for one posture
• Centralization with motion testing
• Peripheralization in direction opposite centralization
Factors Against
• LBP only (no distal symptoms)
• Status quo with all movements

FIGURE. Treatment-based classification decision-making algorithm used by the 19 raters. Abbreviations: FABQ, Fear-Avoidance Beliefs Questionnaire; FABQPA, FABQ physical activity subscale; FABQW, FABQ work subscale; LBP, low back pain; ROM, range of motion; SLR, straight leg raise. Adapted with permission from Fritz et al.13


nerve root compression (positive straight leg raise test and/or lower-limb reflex or strength deficits).

Demographic information (age, sex, prior history of LBP, aggravating and relieving factors, and the duration and location [low back only, below the buttock, or below the knee] of current symptoms) was collected, as well as pain rating (11-point numeric pain rating scale26) and fear avoidance data using the Fear-Avoidance Beliefs Questionnaire,40 which includes work and physical activity subscales. The physical examination included range-of-motion measurements with inclinometers for total lumbar spine flexion and extension and straight leg raise.19,41 Centralization or peripheralization of low back symptoms with lumbar movement was recorded. Patients were also asked to perform 10 repetitions of trunk extension movements in standing and trunk flexion movements in sitting, as well as to hold a trunk extension position for 30 seconds in prone lying. Centralization or peripheralization of low back symptoms was also recorded for each of these movement tests.16 The presence of any aberrant movement patterns during trunk flexion/extension movements was noted and the prone instability test was performed as described previously.25 Lumbar intervertebral mobility (normal, hypomobile, hypermobile) was assessed with the patient in prone lying, by applying a posterior-to-anterior force over each lumbar spinous process. The presence of pain, either local (directly under the therapist’s hand) or distal, with each mobility assessment was also recorded.

The previously published, randomized clinical trial4 included 123 patients with LBP, from which 24 cases were randomly selected for inclusion in this reliability study. The questionnaire and clinical exam data for each patient with LBP were transcribed onto a standardized, 2-page clinical examination form. These data were then used by the 12 raters in the current study to classify each patient using the TBC decision-making algorithm (FIGURE).

Procedures

During a brief training session (2 hours), the 12 UVM raters were oriented to the goals of the study, the paperwork, the TBC schema, and the decision-making algorithm (FIGURE). Several practice cases (different from those used in this study) were used and discussed to familiarize the raters with the data and the algorithm. Following that training, each of the 12 raters was given 24 clinical examination forms that included the patients’ history and physical exam data and a recording form on which to record the classification choice. Working independently, each rater was instructed to use the information on the clinical examination form to assign 1 of 3 classification categories (manipulation, specific exercise, or stabilization) to each of the 24 cases using the decision-making algorithm (FIGURE). Raters were blind to the judgments of the other raters in this study and to those made in the original clinical trial. Only 1 classification judgment was permitted per subject, and each rater was instructed to submit a classification judgment on each of the 24 patients.

To examine which classification judgments were most difficult to distinguish from one another and to examine the interrater reliability of the TBC schema, the ratings of the 12 UVM raters were combined with those of 7 more experienced, expert (EXP) physical therapy raters from a previous study.13 Two of the

TABLE 2. Descriptive Characteristics of the 24 Patients With LBP Using the Treatment-Based Classification Approach*

Characteristic | Value
Age, y | 39.2 ± 11.4
Duration of symptoms (median [range]), d | 20 (1-90)
Sex (female), % | 50%
Numeric pain rating scale (0-10) | 5.6 ± 1.5
Oswestry Disability Questionnaire score (0%-100%) | 40.2% ± 11.5%
FABQ work subscale (0-42) | 11.5 ± 8.7
FABQ physical activity subscale (0-24) | 16.4 ± 5.6
Symptoms distal to buttock (yes), % | 41.7%
Prior history of LBP (yes), % | 70.8%

Abbreviations: FABQ, Fear-Avoidance Beliefs Questionnaire; LBP, low back pain.
*Values are mean ± SD unless otherwise indicated.

TABLE 3. Percent Agreement Among the Raters and for the Categories

Measure | UVM Raters (n = 12) | EXP Raters (n = 7) | All Raters (n = 19)
Percent agreement for raters | | |
Kappa (95% CI) | 0.62 (0.59, 0.65) | 0.47 (0.41, 0.54) | 0.57 (0.55, 0.69)
Overall percentage agreement | 80.9%* | 68.5% | 75.5%
Kappa (95% CI) by classification category for each rater group | | |
Specific exercise | 0.64 (0.48, 0.80) | 0.60 (0.35, 0.85) | 0.61 (0.49, 0.72)
Manipulation | 0.73 (0.49, 0.98) | 0.54 (0.19, 0.90) | 0.66 (0.47, 0.84)
Stabilization | 0.57 (0.40, 0.74) | 0.42 (0.15, 0.69) | 0.49 (0.36, 0.62)

Abbreviations: CI, confidence interval; EXP, more experienced; UVM, University of Vermont.
*Significantly higher than the EXP group (P<.01).


7 raters were considered experts in the TBC schema, 3 were experienced physical therapists, and 2 were novice physical therapists. These 7 EXP raters rated the same 24 cases that were rated by the UVM raters. The purpose of combining the 19 raters’ data was to increase the number of raters and to compare their ability to apply the TBC decision-making algorithm to an existing data set.

Data Analysis

Descriptive characteristics were examined in the 19 raters and 24 patients with LBP. To assess interrater reliability of the TBC schema, overall percentage agreement and kappa coefficients with 95% CIs, as well as the distribution of classification judgments, were computed and compared among the 12 UVM raters and the 7 EXP raters. A chi-square test was used to test for a difference in percentage agreement between UVM and EXP raters, as well as to examine differences in classification distributions between UVM and EXP raters. To explore the nature of the disagreements in the classification decisions, we examined the percentage of judgments in agreement and those in disagreement compared to the percentage of the majority of judgments in each classification category. For each patient with LBP included in the analysis, we considered all raters’ judgments and categorized them as in agreement (ie, the same classification judgment compared to the majority of raters) or in disagreement (ie, different classification judgments compared to the majority of raters). For the single case in which there was a tie in the judgments, the case was classified according to the judgment of the 2 expert raters. Finally, we examined classification judgments that were in disagreement pairwise, counting each pair only once, and categorized them into 1 of 3 possible categories of disagreement: manipulation-stabilization, manipulation-specific exercise, or stabilization-specific exercise. The frequency of each disagreement category was calculated and a chi-square test was done for each of 3 possible categories of disagreement to detect significant differences between the UVM and the EXP rater groups.
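The two bookkeeping steps described above, finding the majority judgment for a patient (with the expert judgment breaking a tie) and tallying each disagreeing pair of raters once, can be sketched as follows. This is a minimal illustration under assumptions, not the authors' analysis code: the function names, the `expert_judgment` parameter, and the example ratings are hypothetical.

```python
from collections import Counter
from itertools import combinations


def majority_judgment(judgments, expert_judgment=None):
    """Most frequent classification for one patient.

    If two or more categories tie for first place, fall back on the
    expert judgment, mirroring the tie rule described in the text.
    """
    counts = Counter(judgments)
    top = max(counts.values())
    leaders = [category for category, n in counts.items() if n == top]
    if len(leaders) == 1:
        return leaders[0]
    if expert_judgment is None:
        raise ValueError("tie between categories and no expert judgment given")
    return expert_judgment


def pairwise_disagreements(judgments):
    """Count disagreeing rater pairs for one patient, each pair only once.

    Keys are alphabetically sorted category pairs, so a pair such as
    manipulation/stabilization is tallied under a single key regardless
    of which rater chose which category.
    """
    tally = Counter()
    for a, b in combinations(judgments, 2):
        if a != b:
            tally[tuple(sorted((a, b)))] += 1
    return tally


# Made-up judgments from 4 raters for a single patient.
ratings = ["manipulation", "manipulation", "stabilization", "specific exercise"]
print(majority_judgment(ratings))
print(pairwise_disagreements(ratings))
```

Summing the per-patient tallies over all patients gives the frequency of each of the 3 disagreement categories.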

RESULTS

The descriptive characteristics of the 19 raters and the 24 subjects with LBP are provided in TABLES 1 and 2, respectively. Of the 24 patients, the 2 expert raters determined that 5 cases should be classified as specific exercise, 13 cases as manipulation, and 6 as stabilization.

UVM Raters Only

Of the possible 288 classifications (24 patients × 12 raters), there were a total of 277 classification judgments available for analysis, thus 1464 pairwise comparisons, using each pair only once, among the 12 raters’ chosen classification. There was an overall 80.9% agreement in the pairs of classification, with a kappa coefficient of 0.62 (95% CI: 0.59, 0.65) (TABLE 3). When examining rater agreement for a particular classification category, the kappa coefficient was 0.64 (95% CI: 0.48, 0.80) for the specific exercise category, 0.73 (95% CI: 0.49, 0.98) for the manipulation category, and 0.57 (95% CI: 0.40, 0.74) for the stabilization category.
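For readers who want to see how pairwise agreement is counted when some judgments are missing, the sketch below works through a toy data set. With complete data, 24 patients and 12 raters would give 24 × 66 = 1584 rater pairs, so the 1464 pairs reported here reflect the missing judgments. The article does not state which multi-rater kappa was used; the chance correction shown is a Fleiss-style version based on pooled category proportions, chosen only for illustration, and it is not expected to reproduce the published coefficients. Function names and data are hypothetical.

```python
from collections import Counter
from itertools import combinations


def pairwise_agreement(ratings_by_patient):
    """Return (agreeing pairs, total pairs), counting each rater pair once.

    ratings_by_patient: one list per patient; None marks a missing judgment,
    and pairs involving a missing judgment are simply not formed.
    """
    agree = 0
    total = 0
    for ratings in ratings_by_patient:
        present = [r for r in ratings if r is not None]
        for a, b in combinations(present, 2):
            total += 1
            if a == b:
                agree += 1
    return agree, total


def chance_corrected_agreement(ratings_by_patient):
    """Kappa-style coefficient: (observed - expected) / (1 - expected).

    Expected agreement is the sum of squared pooled category proportions
    (an assumption made for this sketch).
    """
    agree, total = pairwise_agreement(ratings_by_patient)
    observed = agree / total
    counts = Counter(r for ratings in ratings_by_patient
                     for r in ratings if r is not None)
    n = sum(counts.values())
    expected = sum((c / n) ** 2 for c in counts.values())
    return (observed - expected) / (1 - expected)


# Toy data: 3 patients, 3 raters, one missing judgment.
toy = [
    ["manipulation", "manipulation", "manipulation"],
    ["manipulation", "stabilization", None],
    ["specific exercise", "specific exercise", "stabilization"],
]
agree, total = pairwise_agreement(toy)
print(agree, total, round(100 * agree / total, 1))
print(round(chance_corrected_agreement(toy), 3))
```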

Combined Data Across 19 Raters

Data from the 12 UVM raters were combined with those of 7 EXP raters, resulting in a total of 445 classification judgments made for 24 patients, for a total of 3907 pairwise combinations of rater classification judgments. Overall, 24.0% of the classification judgments were specific exercise, 48.3% were manipulation, and 27.6% were stabilization (TABLE 4). There was no difference in the distribution of classifications among raters from UVM or EXP (chi-square P = .92).
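The reported P = .92 can be checked against the counts given in TABLE 4 (68, 134, and 75 judgments for the UVM raters; 39, 81, and 48 for the EXP raters). The sketch below assumes an ordinary Pearson chi-square test of independence without continuity correction, which the article does not spell out; the function name is hypothetical. For 2 degrees of freedom the upper-tail probability has the closed form exp(-x/2), so no statistics library is needed.

```python
from math import exp


def chi_square_independence(table):
    """Pearson chi-square statistic and degrees of freedom for a 2-way table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


# Rows: specific exercise, manipulation, stabilization. Columns: UVM, EXP.
counts = [
    [68, 39],
    [134, 81],
    [75, 48],
]
statistic, dof = chi_square_independence(counts)

# Closed-form upper-tail probability, valid only when dof == 2.
p_value = exp(-statistic / 2)
print(round(statistic, 3), dof, round(p_value, 2))
```

Under these assumptions the statistic is small and the P value rounds to the published .92, which supports the reading that the two rater groups chose the 3 categories in nearly the same proportions.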

Of the 3907 pairwise combinations of rater classification judgments, 2951 pairs were in agreement, for an overall agreement of 75.5% with a kappa coefficient of 0.57 (95% CI: 0.55, 0.69) (TABLE 3). The overall agreement for the EXP raters was 68.5% with a kappa coefficient of 0.47 (95% CI: 0.41, 0.54) (TABLE 3). The percentage agreement (80.9%) among the UVM raters noted above was greater than that of the EXP raters (P<.01). When examining agreement by a particular classification category for all 19 raters, the kappa coefficient was 0.61 (95% CI: 0.49, 0.72) for the specific exercise category, 0.66 (95% CI: 0.47, 0.84) for the manipulation category, and 0.49 (95% CI: 0.36, 0.62) for the stabilization category.
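The counts in this paragraph can be cross-checked with a few lines of arithmetic. The sketch only restates numbers given in the text; the variable names are illustrative.

```python
from math import comb

patients = 24
raters = 19

# Upper bound on rater pairs if every rater had classified every patient.
max_pairs = patients * comb(raters, 2)

# Values reported in the text for the combined data set.
reported_pairs = 3907
agreeing_pairs = 2951

percent_agreement = 100 * agreeing_pairs / reported_pairs
percent_disagreement = 100 - percent_agreement

print(max_pairs)
print(round(percent_agreement, 1), round(percent_disagreement, 1))
```

The reported 3907 pairs fall below the complete-data bound because 445 rather than 456 (24 × 19) judgments were available.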

Additionally, we examined the frequency of agreement versus disagreement judgments compared to the majority judgments for each classification category (TABLE 5). Raters were most likely to agree on the classification of specific exercise. When the majority judgment was specific exercise, 85.9% of the individual raters made this judgment. Raters were only slightly less likely to agree on a classification of manipulation (84.3%),

TABLE 4. Distribution of Classifications for Each Group of Raters*†

Classification | UVM Raters (n = 12)‡ | EXP Raters (n = 7) | All Raters (n = 19)
Specific exercise | 68 (24.5%) | 39 (23.2%) | 107 (24.0%)
Manipulation | 134 (48.4%) | 81 (48.2%) | 215 (48.3%)
Stabilization | 75 (27.1%) | 48 (28.6%) | 123 (27.6%)

Abbreviations: EXP, more experienced; UVM, University of Vermont.
*Values are n (%).
†Each rater classified 24 patients for a total of 277 judgments made for the UVM raters, 168 judgments for the EXP raters, and 445 judgments overall.
‡There was no difference in the distribution of classifications among raters from UVM or EXP (chi-square P = .92).


and were least likely to agree on a clas-sification of stabilization (77.5%). Thus, the stabilization category had the highest percentage of judgments that disagreed (22.5%), followed by the manipulation category (15.7%) and, last, the specific exercise category (14.2%) (TABLE 5).
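The agreement and disagreement percentages for each majority category follow directly from the judgment counts in TABLE 5. The short sketch below recomputes them; the dictionary layout and the function name are illustrative choices, not part of the study.

```python
# Counts from TABLE 5: each outer key is the majority classification,
# each inner key is the category chosen in an individual judgment.
TABLE_5 = {
    "specific exercise": {
        "specific exercise": 79, "manipulation": 2, "stabilization": 11},
    "manipulation": {
        "specific exercise": 12, "manipulation": 204, "stabilization": 26},
    "stabilization": {
        "specific exercise": 16, "manipulation": 9, "stabilization": 86},
}


def agreement_split(majority, row):
    """Return (percent agreeing, percent disagreeing) with the majority."""
    total = sum(row.values())
    agree = row[majority]
    return 100.0 * agree / total, 100.0 * (total - agree) / total


splits = {name: agreement_split(name, row) for name, row in TABLE_5.items()}

for name, (agree_pct, disagree_pct) in splits.items():
    print(f"{name}: {agree_pct:.1f}% agree, {disagree_pct:.1f}% disagree")
```

The row totals (92, 242, and 111) sum to the 445 judgments reported for all raters, which gives a quick check that no judgments were dropped when the table was transcribed.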

To explore the nature of the disagreements in the classification decisions, we examined the percentage of disagreements in pairwise raters’ judgments for each of 3 possible classification disagreements (manipulation-stabilization, manipulation-specific exercise, or stabilization-specific exercise) for the 19 raters. The overall percentage of pairwise classification judgments that disagreed was 24.5% (TABLE 6). Among these discordant pairwise judgments, the most common disagreement occurred with 1 rater making a judgment of manipulation and the other rater making a judgment of stabilization (11.0% of all pairwise judgments), followed by the disagreement between the stabilization and specific exercise categories (8.2%). The least common pairwise disagreement occurred between judgments of manipulation and specific exercise (5.3%). It is important to note that in 4 patients the expert raters were not in agreement with the majority of classifications. Three of these 4 disagreements involved a manipulation-stabilization mismatch. The distribution of the pairs that disagreed significantly differed between UVM and EXP (chi-square P<.01). Post hoc comparisons for the individual categories revealed that the UVM raters had similar proportions of manipulation-stabilization and specific exercise-stabilization disagreements, whereas EXP raters had more manipulation-stabilization disagreements and comparatively fewer specific exercise-stabilization disagreements. For both groups, the fewest disagreements occurred between the specific exercise and manipulation categories.
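The pairwise tally behind this kind of analysis can be sketched as follows. The judgments shown are hypothetical toy data (3 patients, 4 raters each), not data from the study, and the function name is an assumption made for illustration.

```python
from collections import Counter
from itertools import combinations


def tally_pairs(judgments_by_patient):
    """Count agreeing pairs and each type of disagreeing pair
    across every pair of raters who judged the same patient."""
    counts = Counter()
    for judgments in judgments_by_patient:
        for first, second in combinations(judgments, 2):
            if first == second:
                counts["agree"] += 1
            else:
                # Sort so that A-B and B-A are counted as one type.
                counts["-".join(sorted((first, second)))] += 1
    return counts


# Hypothetical judgments, one inner list per patient.
toy = [
    ["manipulation", "manipulation", "stabilization", "manipulation"],
    ["specific exercise"] * 4,
    ["stabilization", "specific exercise", "stabilization", "manipulation"],
]

counts = tally_pairs(toy)
total = sum(counts.values())
for kind, n in sorted(counts.items()):
    print(f"{kind}: {n} of {total} pairs ({100.0 * n / total:.1f}%)")
```

Because every rater pair falls into exactly one bin, the percentages for the 3 disagreement types always sum to the overall percentage of discordant pairs, as they do in TABLE 6.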

DISCUSSION

This study examined the interrater reliability of applying the classification criteria of the TBC system to clinical data that were previously collected, and also explored the pattern of disagreements in classifications to determine which categories were most difficult to distinguish. In some instances, the agreement among the more novice UVM raters was comparable to or better than that among the EXP raters.

The percentage of disagreements in classification judgments, compared to majority ratings, was highest in the stabilization category, involving mismatches with either the specific exercise or manipulation category.

Interrater Reliability for TBC Classification

The 12 UVM raters had little to no prior experience with the TBC schema and, with minimal training, were able to demonstrate good36 interrater reliability in applying the classification criteria of the TBC system. These results, similar to those reported by others,13,23,36 demonstrate the ease of learning this classification system and its potential clinical utility. Interestingly, the UVM rater who was not a licensed physical therapist had the highest percentage agreement with the majority rating, lending further support to the ease of learning the TBC system.

TABLE 5

Distribution of Classification Judgments and the Frequency of Agreement Versus Disagreement Compared to the Majority Judgments for Each Classification Category (n = 19 Raters)

| Majority Classification Category | Patients per Expert Rater, n | Classification Judgments, n | Judgments of Specific Exercise* | Judgments of Manipulation* | Judgments of Stabilization* |
|---|---|---|---|---|---|
| Specific exercise | 5 | 92 | 79 (85.9%) | 2 (2.2%) | 11 (11.9%) |
| Manipulation | 13 | 242 | 12 (5.0%) | 204 (84.3%) | 26 (10.7%) |
| Stabilization | 6 | 111 | 16 (14.4%) | 9 (8.1%) | 86 (77.5%) |

*Values are n (%).

TABLE 6

Percentage of Discordant Pairwise Raters’ Judgments Compared to the Majority Judgments for Each of 3 Possible Classification Disagreements

| | UVM Raters (n = 12) | EXP Raters (n = 7) | All Raters (n = 19) |
|---|---|---|---|
| Overall percentage of pairwise judgments that disagreed | 19.1% | 31.5% | 24.5% |
| Percentage of manipulation-specific exercise pairwise judgments that disagreed | 3.9%* | 7.3% | 5.3% |
| Percentage of manipulation-stabilization pairwise judgments that disagreed | 7.7%* | 16.5% | 11.0% |
| Percentage of specific exercise-stabilization pairwise judgments that disagreed | 7.5%* | 7.7% | 8.2% |

Abbreviations: EXP, more experienced; UVM, University of Vermont.
*Significantly different proportions from EXP group (chi-square P = .01); post hoc comparisons for the individual categories revealed that the UVM raters had similar proportions of manipulation-stabilization and specific exercise-stabilization disagreements, whereas EXP raters had more manipulation-stabilization disagreements and comparatively fewer specific exercise-stabilization disagreements.

Other studies have used a similar methodology to that of the present study, using written patient cases rather than actual patients. Dankaerts et al10 distributed 25 cases (patients’ subjective information and videotaped functional tests) to 13 clinicians, who independently classified each case using particular pieces of the history and physical examination on a mechanism-based classification system for nonspecific LBP.32 Kappa coefficients ranged from 0.47 to 0.80, and the percentage agreement among raters was 70% (range, 60%-84%), indicating moderate reliability for this classification approach. The use of paper cases removes additional confounding factors, such as exam performance, that may influence reliability outcomes and thus allows the examination of the classification algorithm itself.

Other studies, in contrast, have had raters examine patients repeatedly.13,21,23,33,36,38,39,42 The same patients were examined by different clinicians to determine if they would obtain similar clinical results and assign each patient to the same classification category. Physical therapists with varying levels of experience used the TBC schema and agreed on the categorization of 67% to 79% of mechanical cases.13,36 In another classification schema that used pain pattern recognition as the basis for subgrouping of patients with LBP (n = 204),42 paired physical therapists performed independent examinations on each patient and assigned the patient to 1 of 5 pain patterns. Agreement on patient classification by independent examiners was 78.9% (κ = 0.61). Although this methodology may simulate actual clinical decision making more closely than paper cases do, it does not allow direct testing of mutual exclusivity of the subgroups within the algorithm. If the reliability of a classification schema is shown to be poor and there is a high percentage of disagreements in classification judgments, one does not know if this is due to a lack of robustness of the algorithm or to poor reliability of any number of individual test items that are used in the algorithm to make the classification determination.

Patterns of Classification Disagreements

In our study, there was an overall percentage of discordant pairwise judgments of 24.5%, indicating that some patients with LBP are difficult to categorize with the classification decision making associated with the TBC system. This prevalence is similar to that reported by Stanton et al,36 which indicated that 25% of the patients examined met the subgroup criteria for more than 1 subgroup. Our results point to 2 more prevalent patterns of disagreement involving the stabilization category: stabilization-manipulation and stabilization-specific exercise. In contrast, Stanton et al36 reported that the most common subgroup combination for patients who met the criteria for 2 subgroups was manipulation and specific exercise. One possible reason for this difference is that in the current study, the sampling of patients in the specific exercise subgroup (according to the expert raters) was the lowest (n = 5) of the 3 subgroups sampled. It is also important to note that there were 4 patients about whom the expert raters were not in agreement with the majority of classifications. This may reflect the greater difficulty of classifying the cases chosen for the current study and a lack of sufficient guidance in using the algorithm14 in these more challenging cases.

Raters reported that in about 50% of the patients, assignment of the appropriate category was clear and they were confident in their selection. For the other half of the patients, they reported needing to use the lower portion of the decision-making algorithm, in which they had to weigh factors favoring versus factors against a particular classification category (FIGURE). This made the pattern recognition for distinguishing categories more difficult, and raters reported being less confident in their category selection. In these instances, the decision-making process clearly differed among the raters, as each weighted these factors differently, likely based on their own biases or previous clinical experience. Similarly, Stanton et al36 reported that the 2 novice raters who participated in the reliability portion of their study had different percentages of unclear assessments (38% of patients for the first rater and 61% of patients for the second rater), which necessitated that the raters use the additional classification criteria in the bottom table of the algorithm used in that study.

The implication of the stabilization-manipulation and stabilization-specific exercise mismatches is that, clinically, patients may not be achieving optimal treatment outcomes because of misclassification. Although achieving 100% sensitivity and specificity in a classification algorithm is unrealistic, caution is warranted against prematurely adopting a classification algorithm that lacks mutual exclusivity, as there is a risk of reaching incorrect conclusions regarding the effectiveness of the treatment tested28 and the clinical features that determine patient subgrouping. Our results suggest that additional refinement may be necessary for the stabilization category to distinguish the clinical features associated with stabilization treatment success from those of the specific exercise or manipulation categories.

Study Limitations

One limitation of the study involves the sampling of cases. The 3 categories examined in this study were not represented equally, and the traction category, the fourth possible TBC category, was not included. Based on the expert raters’ judgment, as well as the majority rating, the sampling was biased toward the manipulation category. Given that percentage of agreement is influenced by prevalence of categories, raters had more opportunities to correctly (or incorrectly) classify cases for the manipulation category, and this is reflected in the highest kappa (0.66) for this category in the combined data. That said, the specific exercise category had a kappa of 0.61 and had the lowest sample, based on the EXP raters’ judgments as well as the majority rating. Although the sample of the present study was adequate, a larger sample could have provided a greater diversity of cases, thereby increasing the generalizability of the results.
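One way to see how category prevalence can enter a category-specific kappa is sketched below. It assumes, purely for illustration, that each category is scored one-versus-rest and that observed agreement is fixed at a hypothetical 85% for every category; neither assumption is taken from the study's methods. Prevalence values come from the judgment counts for all raters in TABLE 4.

```python
def one_vs_rest_chance(prevalence):
    """Chance agreement when two judgments independently pick
    'this category' with the given prevalence, or 'any other'."""
    return prevalence ** 2 + (1.0 - prevalence) ** 2


def kappa(p_observed, p_expected):
    """Chance-corrected agreement: (p_o - p_e) / (1 - p_e)."""
    return (p_observed - p_expected) / (1.0 - p_expected)


# Judgment counts for all raters from TABLE 4.
JUDGMENTS = {"specific exercise": 107, "manipulation": 215, "stabilization": 123}
TOTAL = sum(JUDGMENTS.values())

# Hypothetical observed agreement, held constant across categories.
HYPOTHETICAL_OBSERVED = 0.85

results = {}
for category, count in JUDGMENTS.items():
    p_e = one_vs_rest_chance(count / TOTAL)
    results[category] = kappa(HYPOTHETICAL_OBSERVED, p_e)
    print(f"{category}: chance agreement {p_e:.3f}, kappa {results[category]:.2f}")
```

Under these assumptions, the category whose prevalence is closest to 50% has the lowest chance agreement, so the same observed agreement translates into a higher kappa for that category.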

Even though we used raters from 4 different clinical sites, the raters did not examine the patients themselves, and we must acknowledge the possible error in original collection of data from the previous study.4 The raters based their classification judgments solely on data previously recorded by a physical therapist. Although this removed 1 source of potential error (ie, inconsistency of clinical exam performance), it did not replicate the decision making that occurs in the clinic. Thus, other factors that impact clinical decision making, which might have improved or worsened the raters’ classification success, cannot be taken into account. However, this study design allowed us to examine the reliability and the pattern of disagreements that existed when using the TBC algorithm.

CONCLUSION

The TBC algorithm appears to be easy to learn for novice users. With minimal training, novice raters were able to apply the algorithm in classification judgments with good interrater reliability. However, error persisted in the classification decision making associated with the TBC system, in particular for the stabilization category. Additional refinement through the identification of additional and/or novel patient characteristics is needed to improve the clinical utility of the TBC schema.

KEY POINTS

FINDINGS: Raters were most likely to agree on the classification of specific exercise and least likely to agree on a classification of stabilization. The most common disagreement occurred with 1 rater making a judgment of manipulation and the other rater making a judgment of stabilization. The least common disagreement occurred between judgments of manipulation and specific exercise.

IMPLICATIONS: The issue of mutual exclusivity for the 3 TBC categories studied needs to be further addressed through refinement of the decision-making algorithm. Clinical application of the TBC algorithm for patients considered in the stabilization category should be judicious, as this category had the highest percentage of classification mismatches among raters.

CAUTION: Raters were only provided information about the patient on paper and did not conduct the clinic exam themselves. Thus, other factors impacting clinical decision making that might have improved or worsened the raters’ classification success were not taken into account.

ACKNOWLEDGEMENTS: The authors would like to acknowledge the physical therapy practices that participated in this study: Dee Physical Therapy, South Burlington, VT; Evolution Physical Therapy and Yoga, Burlington, VT; Fletcher Allen Health Care, Burlington, VT; and Timberlane Physical Therapy, South Burlington, VT.

REFERENCES

1. Atlas SJ, Deyo RA, Patrick DL, Convery K, Keller RB, Singer DE. The Quebec Task Force classification for spinal disorders and the severity, treatment, and outcomes of sciatica and lumbar spinal stenosis. Spine (Phila Pa 1976). 1996;21:2885-2892.

2. Billis EV, McCarthy CJ, Oldham JA. Subclassification of low back pain: a cross-country comparison. Eur Spine J. 2007;16:865-879. http://dx.doi.org/10.1007/s00586-007-0313-2

3. Borkan JM, Cherkin DC. An agenda for primary care research on low back pain. Spine (Phila Pa 1976). 1996;21:2880-2884.

4. Brennan GP, Fritz JM, Hunter SJ, Thackeray A, Delitto A, Erhard RE. Identifying subgroups of patients with acute/subacute “nonspecific” low back pain: results of a randomized clinical trial. Spine (Phila Pa 1976). 2006;31:623-631. http://dx.doi.org/10.1097/01.brs.0000202807.72292.a8

5. Browder DA, Childs JD, Cleland JA, Fritz JM. Effectiveness of an extension-oriented treatment approach in a subgroup of subjects with low back pain: a randomized clinical trial. Phys Ther. 2007;87:1608-1618. http://dx.doi.org/10.2522/ptj.20060297

6. Buchbinder R, Goel V, Bombardier C, Hogg-Johnson S. Classification systems of soft tissue disorders of the neck and upper limb: do they satisfy methodological guidelines? J Clin Epidemiol. 1996;49:141-149.

7. Childs JD, Fritz JM, Flynn TW, et al. A clinical prediction rule to identify patients with low back pain most likely to benefit from spinal manipulation: a validation study. Ann Intern Med. 2004;141:920-928.

8. Croft P, Papageorgiou A, McNally R. Low back pain. In: Stevens A, Raftery J, eds. Health Care Needs Assessment: The Epidemiologically Based Needs Assessment Reviews. New York, NY: Radcliffe Medical Press; 1997:129-181.

9. Dankaerts W, O’Sullivan P. The validity of O’Sullivan’s classification system (CS) for a sub-group of NS-CLBP with motor control impairment (MCI): overview of a series of studies and review of the literature. Man Ther. 2011;16:9-14. http://dx.doi.org/10.1016/j.math.2010.10.006

10. Dankaerts W, O’Sullivan PB, Straker LM, Burnett AF, Skouen JS. The inter-examiner reliability of a classification method for non-specific chronic low back pain patients with motor control impairment. Man Ther. 2006;11:28-39. http://dx.doi.org/10.1016/j.math.2005.02.001

11. Delitto A, Erhard RE, Bowling RW. A treatment-based classification approach to low back syndrome: identifying and staging patients for conservative treatment. Phys Ther. 1995;75:470-485; discussion 485-489.

12. Flynn T, Fritz J, Whitman J, et al. A clinical prediction rule for classifying patients with low back pain who demonstrate short-term improvement with spinal manipulation. Spine (Phila Pa 1976). 2002;27:2835-2843. http://dx.doi.org/10.1097/01.BRS.0000035681.33747.8D

13. Fritz JM, Brennan GP, Clifford SN, Hunter SJ, Thackeray A. An examination of the reliability of a classification algorithm for subgrouping patients with low back pain. Spine (Phila Pa 1976). 2006;31:77-82. http://dx.doi.org/10.1097/01.brs.0000193898.14803.8a

14. Fritz JM, Cleland JA, Childs JD. Subgrouping patients with low back pain: evolution of a classification approach to physical therapy. J Orthop Sports Phys Ther. 2007;37:290-302. http://dx.doi.org/10.2519/jospt.2007.2498

15. Fritz JM, Delitto A, Erhard RE. Comparison of classification-based physical therapy with therapy based on clinical practice guidelines for patients with acute low back pain: a randomized clinical trial. Spine (Phila Pa 1976). 2003;28:1363-1371; discussion 1372. http://dx.doi.org/10.1097/01.BRS.0000067115.61673.FF

16. Fritz JM, Delitto A, Vignovic M, Busse RG. Interrater reliability of judgments of the centralization phenomenon and status change during movement testing in patients with low back pain. Arch Phys Med Rehabil. 2000;81:57-61.

17. Fritz JM, George S. The use of a classification approach to identify subgroups of patients with acute low back pain. Interrater reliability and short-term treatment outcomes. Spine (Phila Pa 1976). 2000;25:106-114.

18. Fritz JM, Irrgang JJ. A comparison of a modified Oswestry Low Back Pain Disability Questionnaire and the Quebec Back Pain Disability Scale. Phys Ther. 2001;81:776-788.

19. Fritz JM, Piva SR. Physical impairment index: reliability, validity, and responsiveness in patients with acute low back pain. Spine (Phila Pa 1976). 2003;28:1189-1194. http://dx.doi.org/10.1097/01.BRS.0000067270.50897.DB

20. Fritz JM, Piva SR, Childs JD. Accuracy of the clinical examination to predict radiographic instability of the lumbar spine. Eur Spine J. 2005;14:743-750. http://dx.doi.org/10.1007/s00586-004-0803-4

21. Harris-Hayes M, Van Dillen LR. The inter-tester reliability of physical therapists classifying low back pain problems based on the movement system impairment classification system. PM R. 2009;1:117-126. http://dx.doi.org/10.1016/j.pmrj.2008.08.001

22. Hart LG, Deyo RA, Cherkin DC. Physician office visits for low back pain. Frequency, clinical evaluation, and treatment patterns from a U.S. national survey. Spine (Phila Pa 1976). 1995;20:11-19.

23. Heiss DG, Fitch DS, Fritz JM, Sanchez WJ, Roberts KE, Buford JA. The interrater reliability among physical therapists newly trained in a classification system for acute low back pain. J Orthop Sports Phys Ther. 2004;34:430-439. http://dx.doi.org/10.2519/jospt.2004.1555

24. Hicks GE, Fritz JM, Delitto A, McGill SM. Preliminary development of a clinical prediction rule for determining which patients with low back pain will respond to a stabilization exercise program. Arch Phys Med Rehabil. 2005;86:1753-1762. http://dx.doi.org/10.1016/j.apmr.2005.03.033

25. Hicks GE, Fritz JM, Delitto A, Mishock J. Interrater reliability of clinical examination measures for identification of lumbar segmental instability. Arch Phys Med Rehabil. 2003;84:1858-1864.

26. Jensen MP, Turner JA, Romano JM. What is the maximum number of levels needed in pain intensity measurement? Pain. 1994;58:387-392.

27. Kent P, Keating JL. Classification in nonspecific low back pain: what methods do primary care clinicians currently use? Spine (Phila Pa 1976). 2005;30:1433-1440.

28. Konitzer LN, Gill NW, Koppenhaver SL. Investigation of abdominal muscle thickness changes after spinal manipulation in patients who meet a clinical prediction rule for lumbar stabilization. J Orthop Sports Phys Ther. 2011;41:666-674. http://dx.doi.org/10.2519/jospt.2011.3685

29. Moffroid MT, Haugh LD, Henry SM, Short B. Distinguishable groups of musculoskeletal low back pain patients and asymptomatic control subjects based on physical measures of the NIOSH Low Back Atlas. Spine (Phila Pa 1976). 1994;19:1350-1358.

30. Mooney V. The classification of low back pain. Ann Med. 1989;21:321-325.

31. Nordin M, Weiser S, van Doorn JW, Hiebert R, Rom WN. Nonspecific low back pain. In: Rom WN, ed. Environmental and Occupational Medicine. 3rd ed. Philadelphia, PA: Lippincott-Raven Publishers; 1998:947-957.

32. O’Sullivan P. Diagnosis and classification of chronic low back pain disorders: maladaptive movement and motor control impairments as underlying mechanism. Man Ther. 2005;10:242-255. http://dx.doi.org/10.1016/j.math.2005.07.001

33. Petersen T, Olsen S, Laslett M, et al. Inter-tester reliability of a new diagnostic classification system for patients with non-specific low back pain. Aust J Physiother. 2004;50:85-94.

34. Riddle DL. Classification and low back pain: a review of the literature and critical analysis of selected systems. Phys Ther. 1998;78:708-737.

35. Spitzer WO, Quebec Task Force on Spinal Disorders. Scientific approach to the assessment and management of activity-related spinal disorders: a monograph for clinicians. Spine. 1987;12 suppl:S5-S59.

36. Stanton TR, Fritz JM, Hancock MJ, et al. Evaluation of a treatment-based classification algorithm for low back pain: a cross-sectional study. Phys Ther. 2011;91:496-509. http://dx.doi.org/10.2522/ptj.20100272

37. Streiner DL, Norman GR. Health Measurement Scales. A Practical Guide to Their Development and Use. 2nd ed. Oxford, UK: Oxford University Press; 1995.

38. Trudelle-Jackson E, Sarvaiya-Shah SA, Wang SS. Interrater reliability of a movement impairment-based classification system for lumbar spine syndromes in patients with chronic low back pain. J Orthop Sports Phys Ther. 2008;38:371-376. http://dx.doi.org/10.2519/jospt.2008.2760

39. Vibe Fersum K, O’Sullivan PB, Kvåle A, Skouen JS. Inter-examiner reliability of a classification system for patients with non-specific low back pain. Man Ther. 2009;14:555-561. http://dx.doi.org/10.1016/j.math.2008.08.003

40. Waddell G, Newton M, Henderson I, Somerville D, Main CJ. A Fear-Avoidance Beliefs Questionnaire (FABQ) and the role of fear-avoidance beliefs in chronic low back pain and disability. Pain. 1993;52:157-168.

41. Waddell G, Somerville D, Henderson I, Newton M. Objective clinical evaluation of physical impairment in chronic low back pain. Spine (Phila Pa 1976). 1992;17:617-628.

42. Wilson L, Hall H, McIntosh G, Melles T. Inter-tester reliability of a low back pain classification system. Spine (Phila Pa 1976). 1999;24:248-254.
