Analysis of Spinal Fusion Versus Nonsurgical Treatment for Chronic Low Back Pain...

Analysis of Spinal Fusion Versus Nonsurgical Treatment for Chronic Low Back Pain Based on Elisabeth Svensson’s Method

Yingyan Zhu

Department of Statistics

Uppsala University

Supervisor: Adam Taube

2014

Abstract

The utility of spinal fusion for patients with chronic low back pain is controversial. The aim of

the current study is to find out whether spinal fusion benefits the patients or not. Additionally,

two outcome scales, the Balanced Inventory for Spinal Disorders (BIS) and Oswestry Low Back

Pain Disability Index (ODI), are compared to detect which one is preferable. A prospective

randomized study concerning spinal fusion has recently been performed in Sweden. 74 patients

aged 18 to 65 with low back pain lasting longer than one year were randomized into two

treatment groups: a fusion group and a non-fusion group. Evaluations of physical sensations based on several outcome scales were conducted both before and one year after the treatment.

Elisabeth Svensson’s Method, which measures the level of changes both from systematic and

individual aspects, was applied to analyse the effect of spinal fusion. Measures of disorder

(D) and order consistency (MA) were calculated when comparing the two outcome scales. The

conclusions are that spinal fusion has shown significant advantages over non-surgical treatment

methods, and that BIS is recommended over ODI.

Keywords: Chronic Low Back Pain; Spinal Fusion; Svensson’s Method; BIS

Contents

Abstract

1 Introduction
1.1 Background
1.2 Previous Studies
1.3 The Aims of the Study

2 Methodology
2.1 Study Basis
2.2 Statistical Methods: Elisabeth Svensson’s Method
2.2.1 The Contingency Table
2.2.2 Rank Transformable Pattern of Change
2.2.3 Systematic Change
2.2.4 Individual Change
2.2.5 The Measure of Disorder (D) and Order Consistency (MA)

3 Spinal Fusion Versus Nonsurgical Treatment
3.1 Detailed Analysis of the Primary Endpoint Variable: Low Back Pain
3.1.1 The Percentage of Agreement
3.1.2 The Systematic Change
3.1.3 The Individual Change
3.2 Analysis of all the items in BIS
3.2.1 Descriptive Analysis
3.2.2 The Percentage of Agreement
3.2.3 The Systematic Change
3.2.4 The Individual Change
3.3 Results Related to VAS

4 Comparison of BIS and ODI
4.1 Comparison of the Primary Endpoint Variable
4.2 Comparison of Activities, Social Life and Sleeping
4.3 Summary of the Comparison

5 Summary
5.1 Discussion and Further Study
5.2 Conclusion

Acknowledgements

References

Appendix
A Detailed Results About Spinal Fusion V.S. Nonsurgical Treatment Based on BIS
B Detailed Results About Spinal Fusion V.S. Nonsurgical Treatment Based on ODI
C The Balanced Inventory for Spinal Disorders (BIS)
D Oswestry Low Back Pain Disability Index (ODI)

Chapter 1

Introduction

1.1 Background

Back pain is very common in daily life. According to the book "Low Back Pain FAQs", published in 2007, back pain affects over 70% of people at some point in their lives [6]. It is reported to be the most common cause of limited activity among those under 45 years old [2, 4, 6]. Although it may resolve on its own after several days or weeks, back pain is the second most common symptom that requires a visit to a physician [2, 7].

"Chronic Low Back Pain" is back pain in the lower spine that lasts more than 12 weeks and may be a continuous problem [6]. Specifically, it is a nagging or aching pain from the belt line to the tailbone of the spine [6] (Figure 1.1). It occurs mostly among people over 30 years old, especially those aged 45 to 65 [2, 6]. There are various possible causes of low back pain, such as poor posture, obesity, excessive physical labor, fatigue and emotional stress [6]. Yet the actual reason for its occurrence remains under-investigated [6]. Common treatments for chronic low back pain include exercise, massage, cognitive therapy, physical therapy and spinal fusion.

For decades, spinal fusion, also known as "lumbar fusion", "arthrodesis" or "lumbar spinal fusion", has been used as a surgical way to relieve low back pain [1, 12]. Initially it was used to treat infections, deformity, degenerative conditions, spinal tuberculosis and fractures, and it was then gradually adopted to treat lumbar instability and degenerative disc disease [1, 2, 3]. It minimizes or eliminates the pain by preventing movement between vertebrae [2, 3]. For example, in some fusion techniques the disc is removed and the two vertebrae are joined into a single unit. There are several spinal fusion techniques, such as posterolateral arthrodesis, lumbar interbody arthrodesis, posterior lumbar interbody arthrodesis and interbody fusion cages [2]. Each technique suits different patient situations and has both advantages and disadvantages. Before a spinal fusion is performed, specific pathoanatomical diagnoses should therefore be made in order to choose a better timing for the surgery and to match the fusion to the patient’s symptoms [2].

Figure 1.1: Spinal Column [6]

Undertaking spinal fusion is risky and costly. First, radiography (X-ray) and similar methods cannot show the exact segment of the spine in which the back pain occurs [6, 7, 8]. Second, spinal fusion is associated with many complications, such as mortality, thrombosis, hepatitis, gastrointestinal, urinary tract infection, pulmonary complication, skin problems, psychological problems and coping problems [13]. Third, the costs are considered to be very high [2, 5, 9, 10].

The high costs, the increasing rates of re-operation and the risk of complications make the question of whether spinal fusion actually helps patients more and more urgent. Hence, it is of great importance to conduct research in this field.

1.2 Previous Studies

There has been much discussion about whether spinal fusion should be recommended or not. However, very few clinical trials have been done. To the author’s knowledge, only four studies have been conducted in Europe during the last 15 years. These studies have led to varying conclusions (Table 1.1).

Table 1.1: Summary of the Four Previous Studies

The first study was a multicenter randomized controlled trial performed in Sweden by Fritzell et al. [14]. 294 patients were randomized into four groups, three of which were fusion groups with different fusion techniques and one of which was a comparison group. 222 patients were assigned to the three fusion groups and treated as one single group in the analysis. The comparison group consisted of 72 patients. This study measured different aspects using various scales, and the fusion group performed significantly better in almost all aspects except "Depressive Symptoms". The final conclusion was that fusion was more efficient than the commonly used non-surgical treatment. Among the four recent studies, Fritzell et al.’s is the only contribution that gives a positive conclusion for fusion. However, the large difference in the number of patients between the fusion and non-fusion groups has affected the results. Furthermore, Dr. Mooney questioned the criteria by which patients were assigned to a group, suggesting that these patients were not suitable for surgery [14]. These two weaknesses of the study reduce the reliability of its conclusion.

Two studies were performed by Brox and his colleagues [15, 16]. The first was a randomized clinical trial with 64 (37+27) patients in Norway in 2003. The Oswestry Low Back Pain Disability Index (ODI) was used as the primary scale, and the Mann-Whitney test was employed to calculate the p-value for the difference in changes from baseline to 1-year follow-up. Based on this single p-value (0.33), the authors concluded that "The main outcome measure showed equal improvement in patients with chronic low back pain and disc degeneration randomized to cognitive intervention and exercises, or lumbar fusion" [15, p1913]. But if the secondary outcome measures are also taken into consideration, the conclusion needs to be re-evaluated. The secondary outcome measures cover 12 aspects, and six of them have p-values smaller than 0.05, which means that a significant difference between the two groups can be seen in six out of twelve aspects. These results may indicate that fusion performs better with regard to at least half of the aspects. The second study led by Brox, a similar randomized clinical trial with 60 (29+31) patients in 2006, has the same problem [16]. Three out of eight aspects showed significant differences between the two groups, but the final conclusion was based on the single p-value for the main measure identified by the authors.

The research conducted by Fairbank et al. [12] in 2005 in the United Kingdom was also a multicenter randomized controlled trial, which involved 349 patients from 15 centres. Several outcome scales were used in this study, and almost all the p-values for the difference in changes were larger than 0.05. Hence, it was concluded that there was no clear evidence that spinal fusion has a significant benefit for the patients.

The last three studies reviewed above treated the results of the scales as scores that can be regarded as continuous data, and used statistical methods such as multiple regression, which is not appropriate for ordinal data. The outcome scales of such clinical research evaluate subjective physical sensations. Questionnaires of this kind are rating scales even if the questionnaire itself prescribes a scoring method. The mathematical operations that were used are not appropriate for this kind of data.

Regardless of the statistical methods used in these studies, the conclusions are not consistent. So the uncertainty about whether spinal fusion benefits patients with chronic low back pain still remains.

1.3 The Aims of the Study

This paper analyses spinal fusion versus non-surgical treatment for chronic low back pain. The aim is to find out whether spinal fusion benefits patients who suffer from chronic low back pain. For the analysis of the systematic and random differences between paired ordinal categorical data, a rank-invariant non-parametric method, which has not been used before in such a randomized study of spinal fusion, will be applied. In addition, BIS and ODI will be compared in order to find out which one is to be preferred.

Chapter 2

Methodology

2.1 Study Basis

A prospective randomized study concerning spinal fusion has recently been performed by the Clinic of Spinal Surgery in Strängnäs, Sweden (CSS), an organization that specializes in spinal surgery. The patients involved in this study were randomized into two groups. One group was given surgical treatment, while the other had non-surgical treatment. (The detailed treatment therapy for the two groups is listed in Table 2.1.) Each group has 37 patients, so a total of 74 patients are involved in the study.

Table 2.1: Treatment Therapy for the Two Groups

Oral and written information was given to the patients before they were included in the study, and informed consent was also given. The following criteria were applied for the inclusion of patients [11]:

1. Age between 18 and 65;

2. The discomfort or symptoms of low back pain have continued for at least one year;

3. Extensive conservative treatment that includes physical therapy has been performed for at least seven weeks, but severe discomfort still continues;

4. No previous fusion surgery in the lumbar area;

5. The patient has distinctive symptoms, such as severe low back pain located centrally in the vertebral column, aching in character and combined with stabbing pain at sudden movements, diffuse pain radiation down the legs and, in most cases, bladder dysfunction with urgency and frequency.

These patients can be considered as a well-defined group with specific symptoms that can be

separated from other patients with severe back problems.

The operations were performed by Doctor Bo Nyström with micro-surgical techniques. The

patients in both groups were evaluated before the treatment and at one year after the treatment.

Several different outcome scales have been applied. They are the Balanced Inventory for Spinal

Disorders (BIS), Oswestry Low Back Pain Disability Index (ODI), 36-Item Short Form Survey

Instrument (SF-36) and European Quality of Life Scale (Euro-Qol). Among them BIS is the

newest one, and of special interest in this study. The others are already widely used in this field.

BIS and ODI will be compared later.

BIS is a questionnaire designed to detect the influence of back and leg pain on physical health, social life, mental health and quality of life. Its validity and reliability have already been demonstrated [17, 18]. It has 17 items in total for these four dimensions, and the answer alternatives are ordered categories such as: "1. none, 2. negligible, 3. moderate, 4. rather severe, 5. very severe". The 17 items are: "Low Back Pain", "Leg Pain", "Limited Indoor Activities", "Limited Outdoor Activities", "Physical Health", "Limited Leisure Activities", "Limited Social Activities", "Limited Social Life", "Feeling Down", "Feeling Impatience", "Lack of Initiative", "Concentration Difficulties", "Sleep Disturbance", "Overall Mental Health", "Limitation in Living", "Perceived Concern" and "Overall Quality of Life". The patients’ physical sensations were surveyed on two occasions. The BIS used on the second occasion contains additional questions about the change in the corresponding item after the surgery. These questions are of the type "How is your back pain today? (change in back pain at follow-up)", with the choices "1. no pain, 2. much better, 3. somewhat better, 4. unchanged, 5. somewhat worse, 6. much worse". The complete BIS and ODI questionnaires are shown in Appendices C and D.

According to the study protocol, the main variables are related to "Change in pain in the low

back and radiating to the legs and the discomfort related to limitations in activity one year

after treatment" [11, appendix 2, p3]. This so called primary endpoint contains a number of

variables. Therefore, in our analysis we have selected the first item in BIS, "Low Back Pain",

as the primary endpoint variable. All the other variables will be analysed in a similar manner.

2.2 Statistical Methods: Elisabeth Svensson’s Method

The answers obtained from the questions are ratings on ordered scales. Although the answers are labelled by numbers, the numbers only indicate order and have no numerical meaning [19]. Addition, subtraction and parametric methods in general are therefore inappropriate and have no empirical meaning [20]. So, the differences between scale points do not give an adequate measure of change in this study.

Methods like the sign test and McNemar’s test are suitable for analysing ordered categorical data [21, 22, 23]. However, they only test whether a change between two occasions exists, and thus do not use the information in the data efficiently [21, 22]. A rank-invariant non-parametric method for the analysis of ordered scales was developed by Svensson in 1993; it measures the level of change from both a systematic and an individual perspective [24, 25]. This method will be used here to analyse the effect of the fusion.

Svensson’s method has many good properties, but it is not yet widely used. Therefore, it will be introduced in detail using the primary endpoint variable.

2.2.1 The Contingency Table

A contingency table like Figure 2.1 (a) forms the basis for Svensson’s method. The X-axis stands for the observations before treatment, while the Y-axis stands for the observations after treatment. The number in each cell represents how many patients chose the i:th category before treatment and the j:th category after. For example, the number "4" in the (Rather Severe, Moderate) cell means that four patients rated the pain in the low back as "rather severe" before the treatment and "moderate" after. Apparently, the treatment was effective for these four people.

The sums of each row and column form the marginal distributions for the two occasions separately. The overall systematic change is evaluated with the help of these marginal distributions [22]. By comparing the marginal distributions in Figure 2.1 (a) and (b), it is clear that more patients shifted their choices to lower categories in the fusion group than in the comparison group for the item "Low Back Pain".

Figure 2.1: The Frequency Distribution of the Pairs Before and After Treatment for "Low Back Pain": (a) Fusion Group, (b) Comparison Group

The lower-left to upper-right line of the contingency table is the agreement line, where the patients have given the same answer at the two time points. The percentage agreement (PA), which shows the proportion of patients who did not change their choices, is therefore defined as:

\[ PA = \frac{\text{Sum of the Observations on the Agreement Line}}{\text{Total Observations}} \tag{2.1} \]
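As a minimal sketch of formula 2.1, the following Python function computes PA from a square contingency table. The table layout (rows indexed by the category before treatment, columns by the category after) and the counts in `example` are hypothetical and chosen only for illustration; they are not the study data.

```python
def percentage_agreement(table):
    """Share of pairs that lie on the agreement diagonal of a square table."""
    n = sum(sum(row) for row in table)
    agree = sum(table[k][k] for k in range(len(table)))
    return agree / n


# Hypothetical 3-category table (NOT study data):
# rows = category before treatment, columns = category after treatment.
example = [
    [2, 1, 0],
    [3, 4, 1],
    [1, 2, 6],
]

pa = percentage_agreement(example)
print(f"PA = {pa:.2%}")
```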

2.2.2 Rank Transformable Pattern of Change

Starting from the marginal distributions in Figure 2.1 (a), we can construct the so-called rank transformable pattern of change (RTPC) table (Figure 2.2) by means of the following rules:

1. The sum of the first column is zero, so only "0" observations can be put in the cell (None, None);

2. The sum of the lowest row is 13. Since the (None, None) cell holds zero observations, the 13 observations would next go in the cell (Negligible, None), but the sum of the second column is also zero, so only "0" can be put there. The next candidate is the cell (Moderate, None). Since 13 is smaller than the column total 14, "13" can be put there;

3. Only one observation is left for the third column. Since one is smaller than the row total 11, "1" can be put in the cell (Moderate, Negligible);

4. Then ten observations are left for the second lowest row, and ten is smaller than the column total 18, so "10" can be put in the cell (Rather Severe, Negligible);

5. Similar calculations continue until the sum of the whole table equals the total sample size.

Figure 2.2: The RTPC Table of Item "Low Back Pain" for Fusion Group

The RTPC table shows the only pattern in which the ranking of the patients is exactly the same before and after treatment. The RTPC is the key to Svensson’s method because it describes the systematic shift between the two time points.
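The construction rules above can be sketched in Python as follows. The function name `rtpc` and the list-of-lists layout (`table[i][j]` holds the pairs with before-category i and after-category j, lowest category first) are assumptions made for illustration. The marginal frequencies are those of the fusion group for "Low Back Pain", as used in the worked calculation in Section 3.1.2.

```python
def rtpc(before, after):
    """Build the rank-transformable pattern of change from two marginals.

    before[i] and after[j] are category frequencies, lowest category first.
    Returns table[i][j] = pairs with before-category i and after-category j.
    """
    if sum(before) != sum(after):
        raise ValueError("both marginals must have the same total")
    table = [[0] * len(after) for _ in before]
    left_b, left_a = list(before), list(after)
    i = j = 0
    while i < len(before) and j < len(after):
        k = min(left_b[i], left_a[j])  # as many pairs as both margins allow
        table[i][j] = k
        left_b[i] -= k
        left_a[j] -= k
        if left_b[i] == 0:
            i += 1  # before-category used up, move to the next one
        else:
            j += 1  # after-category used up, move to the next one
    return table


# Marginals of the fusion group (none, negligible, moderate, rather severe, very severe)
x_before = [0, 0, 14, 18, 5]
y_after = [13, 11, 8, 3, 2]
fusion_rtpc = rtpc(x_before, y_after)
for row in fusion_rtpc:
    print(row)
```

Under these assumptions the first non-zero cells come out as 13 in (Moderate, None), 1 in (Moderate, Negligible) and 10 in (Rather Severe, Negligible), which agrees with rules 2 to 4.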

2.2.3 Systematic Change

2.2.3.1 Relative Position, RP

The systematic change in position between the two ordered categorical classifications is called the relative position (RP). It is theoretically defined as $P(X < Y) - P(Y < X)$, where X and Y denote the categorical distributions before and after treatment respectively [18, 22-24]. The empirical measure can be calculated as [24, 25]:

\[ RP = p_0 - p_1 = \frac{1}{n^2}\sum_{v=1}^{m} y_v\, C(X)_{v-1} - \frac{1}{n^2}\sum_{v=1}^{m} x_v\, C(Y)_{v-1} \tag{2.2} \]

where

$n$: the sum of the observations;

$m$: the number of categories;

$x_v$ and $y_v$: the v:th category frequencies of the marginal distributions X and Y;

$C(X)_v$ and $C(Y)_v$: the v:th category cumulative frequencies of the marginal distributions X and Y, with $C(X)_0 = C(Y)_0 = 0$.

The value of RP ranges from -1 to 1 since it is the difference of two probabilities [22, 24]. RP reaches the lower limit -1 if every observation on occasion Y lies in a lower category than every observation on occasion X, and the upper limit 1 in the opposite case [22, 24]. An RP close to zero indicates that negligible systematic change in position has occurred between the two occasions [22, 24]. In this study, an RP value smaller than zero implies a systematic change towards lower categories, which is the desired effect.
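Formula 2.2 can be sketched in a few lines of Python. The function name `relative_position` is an assumption for illustration; the marginal frequencies are those of the fusion group for "Low Back Pain" that appear in the worked calculation in Section 3.1.2, so the printed values can be compared with the figures given there.

```python
from itertools import accumulate


def relative_position(x, y):
    """Return (p0, p1, RP) for marginal frequencies x (before) and y (after)."""
    n = sum(x)
    # cx[v] is the cumulative frequency up to the previous category,
    # i.e. C(X)_{v-1} in formula 2.2, with C(X)_0 = 0.
    cx = [0] + list(accumulate(x))
    cy = [0] + list(accumulate(y))
    p0 = sum(y[v] * cx[v] for v in range(len(y))) / n**2
    p1 = sum(x[v] * cy[v] for v in range(len(x))) / n**2
    return p0, p1, p0 - p1


x_before = [0, 0, 14, 18, 5]   # fusion group, before treatment
y_after = [13, 11, 8, 3, 2]    # fusion group, after treatment
p0, p1, rp = relative_position(x_before, y_after)
print(f"p0 = {p0:.8f}, p1 = {p1:.8f}, RP = {rp:.4f}")
```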

2.2.3.2 Relative Concentration, RC

In order to have a comprehensive evaluation of the systematic change, the systematic change in concentration (RC) is defined by $P(X_{l_1} < Y_k < X_{l_2}) - P(Y_{l_1} < X_k < Y_{l_2})$, where $X_{l_1}$, $Y_k$, $X_{l_2}$, $Y_{l_1}$, $X_k$ and $Y_{l_2}$ denote discrete independent random variables [22, 23, 24]. The empirical measure can be calculated by [24, 25]:

\[ RC = \frac{1}{M n^3}\left\{ \sum_{v=1}^{m} \left[ y_v\, C(X)_{v-1}\,(n - C(X)_v) - x_v\, C(Y)_{v-1}\,(n - C(Y)_v) \right] \right\} \tag{2.3} \]

where $M = \min[p_0 - p_0^2,\; p_1 - p_1^2]$.

The RC value ranges from $-p(1-p)$ to $p(1-p)$, where $p = P(X_k \le Y_{l_1})$. It will be larger than zero when the marginal distribution of occasion Y is more concentrated relative to the marginal distribution of occasion X [24, 26].
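A minimal Python sketch of formula 2.3 follows. The function name `relative_concentration` is an assumption; the input is again the pair of fusion-group marginals for "Low Back Pain" used in Section 3.1.2. The guard for M = 0 is an addition of this sketch, since the formula is undefined in that case.

```python
from itertools import accumulate


def relative_concentration(x, y):
    """Empirical RC (formula 2.3) for marginal frequencies x (before), y (after)."""
    n = sum(x)
    cx = [0] + list(accumulate(x))  # cx[v] = C(X)_{v-1}, cx[v + 1] = C(X)_v
    cy = [0] + list(accumulate(y))
    p0 = sum(y[v] * cx[v] for v in range(len(y))) / n**2
    p1 = sum(x[v] * cy[v] for v in range(len(x))) / n**2
    m_norm = min(p0 - p0**2, p1 - p1**2)
    if m_norm == 0:
        raise ValueError("M is zero, so RC is undefined for these marginals")
    total = sum(
        y[v] * cx[v] * (n - cx[v + 1]) - x[v] * cy[v] * (n - cy[v + 1])
        for v in range(len(x))
    )
    return total / (m_norm * n**3)


x_before = [0, 0, 14, 18, 5]   # fusion group, before treatment
y_after = [13, 11, 8, 3, 2]    # fusion group, after treatment
rc = relative_concentration(x_before, y_after)
print(f"RC = {rc:.4f}")
```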

2.2.3.3 Standard Error and 95% Confidence Intervals of RP and RC

The measures RP and RC are asymptotically normally distributed, and the variances of their empirical measures are estimated with the help of the jackknife technique [21-24]. According to Svensson’s proof, $VAR(RP)$ can be obtained by the following process:

\[ \sigma^2_{jack}(RP) = \frac{n-1}{n}\sum_{\kappa=1}^{n}\left(RP_{(\kappa)} - RP_{(\bullet)}\right)^2 \tag{2.4} \]

where

$RP_{(\kappa)}$: the Relative Position with one observation, $\kappa$, deleted;

$RP_{(\bullet)}$: the average of all possible values of Relative Position with one observation deleted, $\kappa = 1, ..., n$.

Further,

\[ VAR(RP) = \left(\frac{n-1}{n}\right)^2 \sigma^2_{jack}(RP) \tag{2.5} \]

Then,

\[ SE(RP) = \frac{n-1}{n}\,\sigma_{jack}(RP) \tag{2.6} \]

And the 95% confidence interval will be:

\[ RP \pm 1.96 \times SE(RP) \tag{2.7} \]

The standard errors and 95% confidence intervals of RC can be obtained by analogous formulas, changing every RP to RC in formulas 2.4-2.7.
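The jackknife procedure in formulas 2.4 to 2.7 can be sketched as below. The leave-one-out step needs the full contingency table, which is not reproduced in the text, so the counts in `example` are hypothetical and serve only to show the mechanics. The function names and the layout (`table[i][j]` = pairs with before-category i and after-category j) are likewise assumptions.

```python
from itertools import accumulate
from math import sqrt


def rp_from_table(table):
    """Relative position (formula 2.2) computed from a square table."""
    x = [sum(row) for row in table]        # marginal before treatment
    y = [sum(col) for col in zip(*table)]  # marginal after treatment
    n = sum(x)
    cx = [0] + list(accumulate(x))
    cy = [0] + list(accumulate(y))
    p0 = sum(y[v] * cx[v] for v in range(len(y))) / n**2
    p1 = sum(x[v] * cy[v] for v in range(len(x))) / n**2
    return p0 - p1


def jackknife_se_rp(table):
    """Return (sigma^2_jack, SE) of RP following formulas 2.4 and 2.6."""
    n = sum(sum(row) for row in table)
    loo = []  # one leave-one-out RP value per observation
    for i, row in enumerate(table):
        for j, count in enumerate(row):
            if count == 0:
                continue
            reduced = [list(r) for r in table]
            reduced[i][j] -= 1  # delete one observation from this cell
            loo.extend([rp_from_table(reduced)] * count)
    mean = sum(loo) / n
    var_jack = (n - 1) / n * sum((v - mean) ** 2 for v in loo)  # (2.4)
    se = (n - 1) / n * sqrt(var_jack)                           # (2.6)
    return var_jack, se


# Hypothetical 3-category table (NOT study data)
example = [
    [3, 1, 0],
    [4, 2, 1],
    [2, 3, 4],
]
rp = rp_from_table(example)
var_jack, se = jackknife_se_rp(example)
ci = (rp - 1.96 * se, rp + 1.96 * se)  # (2.7)
print(f"RP = {rp:.4f}, SE = {se:.4f}, 95% CI = ({ci[0]:.4f}, {ci[1]:.4f})")
```

Deleting an observation changes both marginals at once, which is why the sketch loops over cells rather than over the marginal frequencies.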

2.2.3.4 ∆RP and ∆RC

In order to test whether there is a significant difference between the two groups, $\Delta RP = RP_C - RP_F$, $\Delta RC = RC_C - RC_F$ and the corresponding 95% confidence intervals are calculated, where the subscripts C and F denote the comparison group and the fusion group.

The two groups are independent, so the RP and RC values for the two groups are independent. The standard errors of $\Delta RP$ and $\Delta RC$ can be obtained by:

\[ SE(\Delta RP) = \sqrt{VAR(\Delta RP)} = \sqrt{VAR(RP_C - RP_F)} = \sqrt{VAR(RP_C) + VAR(RP_F)} \tag{2.8} \]

\[ SE(\Delta RC) = \sqrt{VAR(\Delta RC)} = \sqrt{VAR(RC_C - RC_F)} = \sqrt{VAR(RC_C) + VAR(RC_F)} \tag{2.9} \]

where $VAR(RP_F)$, $VAR(RP_C)$, $VAR(RC_F)$ and $VAR(RC_C)$ can be obtained by formula 2.5.

The approximate 95% confidence intervals for $\Delta RP$ and $\Delta RC$ are:

\[ (RP_C - RP_F) \pm 1.96\, SE(RP_C - RP_F) \tag{2.10} \]

\[ (RC_C - RC_F) \pm 1.96\, SE(RC_C - RC_F) \tag{2.11} \]

2.2.3.5 ROC Curve

The systematic change can also be illustrated graphically by a relative operating characteristic (ROC) curve. The ROC curve is obtained by plotting the pairs of cumulative proportional marginal frequencies, with starting point (0, 0) and ending point (1, 1). If A and B represent the areas between the curve and the diagonal (Figure 2.3), the relationship between the ROC curve and the RP value is $p_0 = \frac{1}{2} - A + B$ and $p_1 = \frac{1}{2} - B + A$, so $RP = p_0 - p_1 = 2(B - A)$.

Figure 2.3: An S-shaped ROC curve

Figure 2.4: The ROC Curves of the Fusion Group and the Comparison Group for the 1st item in BIS

A curve above the diagonal indicates a systematic change towards lower categories, while a concave curve below the diagonal stands for a systematic change towards higher categories. The greater the deviation from the diagonal, the stronger the systematic change between the two time points [22-25]. Since a lower category stands for a better situation for the patients, Figure 2.4 shows that the fusion group performs far better in relieving back pain.

Thus, we have already reached the basic conclusion concerning the primary endpoint in the

study protocol: Fusion is superior to non-fusion for these patients.

2.2.4 Individual Change

2.2.4.1 Augmented Mean Ranks

In order to fit the two rankings (before and after) together as closely as possible without violating the ordinary ranking, augmented mean ranks are calculated [24]. These are ranks related to pairs. An observation in the ij:th cell of a contingency table has two augmented mean ranks, because the rank can be calculated based on the occasion before the treatment or the occasion after. (Note: here i and j stand for the i:th and j:th categories on the two occasions respectively, not for the rows and columns of the table.) The notations $R^{(1)}_{ij}$ and $R^{(2)}_{ij}$ stand for the two augmented mean ranks, where "1" stands for the occasion "Before the Treatment" and "2" for "After Treatment".

Based on the before-treatment occasion, the observations in the ij:th cell have ranks ranging from $\sum_{i_1=1}^{i-1} x_{i_1\cdot} + \sum_{j_1=1}^{j-1} x_{ij_1} + 1$ to $\sum_{i_1=1}^{i-1} x_{i_1\cdot} + \sum_{j_1=1}^{j} x_{ij_1}$. The augmented mean rank can then be obtained by:

\[ R^{(1)}_{ij} = \frac{\left(\sum_{i_1=1}^{i-1} x_{i_1\cdot} + \sum_{j_1=1}^{j-1} x_{ij_1} + 1\right) + \left(\sum_{i_1=1}^{i-1} x_{i_1\cdot} + \sum_{j_1=1}^{j} x_{ij_1}\right)}{2} = \sum_{i_1=1}^{i-1} x_{i_1\cdot} + \sum_{j_1=1}^{j-1} x_{ij_1} + \frac{1}{2}(1 + x_{ij}) \tag{2.12} \]

Similarly, $R^{(2)}_{ij}$ can be calculated by:

\[ R^{(2)}_{ij} = \sum_{j_1=1}^{j-1} x_{\cdot j_1} + \sum_{i_1=1}^{i-1} x_{i_1 j} + \frac{1}{2}(1 + x_{ij}) \tag{2.13} \]

$R^{(1)}_{ij}$ and $R^{(2)}_{ij}$ are only valid when $x_{ij} \ge 1$, where $x_{ij}$ is the frequency in the ij:th cell and $i, j = 1, 2, ..., m$.

If $R^{(1)}_{ij} = R^{(2)}_{ij}$ for all ij, the situation is rank transformable and all the differences between the two occasions are caused by systematic changes. The greater the difference between the two augmented mean ranks, the more the individual changes are responsible for the differences before and after the treatment [24].
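The two augmented mean ranks of formulas 2.12 and 2.13 can be sketched in Python as follows. The layout (`table[i][j]` = pairs with before-category i and after-category j, lowest category first) and the function name are assumptions for illustration. `rtpc_table` is the RTPC pattern obtained by continuing the construction rules of Section 2.2.2 for the fusion-group marginals, and `mixed` is a hypothetical table that is not study data.

```python
def augmented_mean_ranks(table):
    """Return (r1, r2): augmented mean ranks per cell, None for empty cells."""
    m = len(table)
    row_tot = [sum(r) for r in table]        # x_{i.}, before-treatment marginal
    col_tot = [sum(c) for c in zip(*table)]  # x_{.j}, after-treatment marginal
    r1 = [[None] * m for _ in range(m)]
    r2 = [[None] * m for _ in range(m)]
    for i in range(m):
        for j in range(m):
            x = table[i][j]
            if x == 0:
                continue  # ranks are only defined for non-empty cells
            # (2.12): rank based on the before-treatment ordering
            r1[i][j] = sum(row_tot[:i]) + sum(table[i][:j]) + (1 + x) / 2
            # (2.13): rank based on the after-treatment ordering
            r2[i][j] = (sum(col_tot[:j])
                        + sum(table[k][j] for k in range(i))
                        + (1 + x) / 2)
    return r1, r2


# RTPC pattern for the fusion-group marginals (derived, see lead-in)
rtpc_table = [
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [13, 1, 0, 0, 0],
    [0, 10, 8, 0, 0],
    [0, 0, 0, 3, 2],
]
# Hypothetical table with individual change (NOT study data)
mixed = [
    [2, 1, 0],
    [1, 2, 1],
    [0, 1, 2],
]

r1, r2 = augmented_mean_ranks(rtpc_table)
print("RTPC ranks equal:", r1 == r2)
m1, m2 = augmented_mean_ranks(mixed)
print("mixed cell (0, 1):", m1[0][1], m2[0][1])
```

In the RTPC pattern both ranks coincide in every non-empty cell, which is the rank-transformable situation described above; in the hypothetical `mixed` table they differ.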

2.2.4.2 Relative Rank Variance (RV) and Internal Rank Variance (IV)

The individual change is a random component of the disagreement between the two occasions and can be described by the relative rank variance (RV), the observable part, and the internal rank variance (IV), the unobservable part, together.

RV is related to the difference between the augmented mean ranks and can be calculated by:

\[ RV = \frac{6}{n^3}\sum_{i=1}^{m}\sum_{j=1}^{m}\left(R^{(1)}_{ij} - R^{(2)}_{ij}\right)^2 x_{ij} \tag{2.14} \]

where $i, j = 1, 2, ..., m$.

The value of RV ranges from zero to one and will never reach one. If RV = 0, the disagreement pattern between the two occasions is rank transformable, which means there are no observable individual changes.

Due to the restricted number of ordered categories, there will be unobservable individual variance [24]. IV is an empirical measure that captures this:

\[ IV = \frac{1}{n^3}\sum_{i=1}^{m}\sum_{j=1}^{m} x_{ij}\left(x_{ij}^2 - 1\right) \tag{2.15} \]

The value of IV also ranges from zero to one and will never reach one. The smaller the value of IV is, the more information is obtained about the sources of difference between the two occasions [24]. So for the IV value, the smaller the better.

2.2.4.3 Augmented Correlation Coefficient, rα

A measure of the closeness of the observations to the RTPC is defined as the augmented correlation coefficient ($r_\alpha$) [27, 28]. It links RV and IV together and can be calculated as:

\[ r_\alpha = 1 - \frac{n^3 RV}{(n^3 - n) - n^3 IV} \tag{2.16} \]

$r_\alpha$ belongs to the Pearson correlation family and ranges from -1 to 1. The higher the value, the lower the individual variation from the RTPC [27]. $r_\alpha = 1$ means that all pairs of augmented mean ranks are equal and that all the changes which happened after the treatment can be explained by systematic changes [27].

A hypothesis test with $H_0: r_\alpha = 1$ versus $H_1: r_\alpha < 1$ can be conducted to test whether any individual changes exist [27]. The test statistic is approximately normally distributed:

\[ z = (r_\alpha - 1)\sqrt{n - 1} \tag{2.17} \]

The p-value can then easily be obtained with statistical software. If the p-value is smaller than the standard significance level 0.05, the null hypothesis is rejected and individual changes are considered to exist.
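Formulas 2.14 to 2.17 can be combined in one short Python sketch. The function name, the table layout (`table[i][j]` = pairs with before-category i and after-category j) and the use of a one-sided lower-tail p-value for $H_1: r_\alpha < 1$ are assumptions of this sketch; the text does not state how the p-value is computed. `rtpc_table` is the RTPC pattern derived from the fusion-group marginals, and `mixed` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def individual_change(table):
    """Return (RV, IV, r_alpha, z, p) following formulas 2.14 to 2.17."""
    m = len(table)
    n = sum(sum(r) for r in table)
    row_tot = [sum(r) for r in table]
    col_tot = [sum(c) for c in zip(*table)]
    rv_sum = 0.0
    iv_sum = 0
    for i in range(m):
        for j in range(m):
            x = table[i][j]
            if x == 0:
                continue
            r1 = sum(row_tot[:i]) + sum(table[i][:j]) + (1 + x) / 2      # (2.12)
            r2 = (sum(col_tot[:j]) + sum(table[k][j] for k in range(i))
                  + (1 + x) / 2)                                         # (2.13)
            rv_sum += x * (r1 - r2) ** 2
            iv_sum += x * (x**2 - 1)
    rv = 6 * rv_sum / n**3                                               # (2.14)
    iv = iv_sum / n**3                                                   # (2.15)
    # Note: the denominator is zero if all observations share one cell.
    r_alpha = 1 - n**3 * rv / ((n**3 - n) - n**3 * iv)                   # (2.16)
    z = (r_alpha - 1) * sqrt(n - 1)                                      # (2.17)
    p = NormalDist().cdf(z)  # assumed one-sided, lower tail
    return rv, iv, r_alpha, z, p


rtpc_table = [
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [13, 1, 0, 0, 0],
    [0, 10, 8, 0, 0],
    [0, 0, 0, 3, 2],
]
mixed = [  # hypothetical table, NOT study data
    [2, 1, 0],
    [1, 2, 1],
    [0, 1, 2],
]
rtpc_result = individual_change(rtpc_table)
mixed_result = individual_change(mixed)
print("RTPC :", rtpc_result)
print("mixed:", mixed_result)
```

For the RTPC pattern the sketch gives RV = 0 and $r_\alpha = 1$, as the text states for a rank-transformable situation.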

2.2.5 The Measure of Disorder (D) and Order Consistency (MA)

The measure of disorder (D) and the measure of order consistency (MA) are two measures of the consistency between two occasions, and will be used to assess whether BIS and ODI are consistent.

Two different scales for the same variable can be illustrated by a contingency table [29] (e.g. Figure 4.2). Pairs along the main diagonal direction are considered consistent. For instance, the pair ("very severe" ODI, "rather severe" BIS) is consistent in ordering with ("fairly severe" ODI, "moderate" BIS) [17]. On the contrary, pairs in cells opposite to the main direction are considered disordered.

The empirical measure of disorder (D) is defined as [17, 29, 30]:

\[ D = \frac{2\sum_{i=1}^{m_1}\sum_{j=1}^{m_2} x_{ij}\, x^{lr}_{ij}}{n(n-1) - t} \tag{2.18} \]

where

$t = \sum_{i=1}^{m_1}\sum_{j=1}^{m_2} x_{ij}(x_{ij} - 1)$;

$x_{ij}$: ij:th cell frequency;

$x^{lr}_{ij}$: the lower-right region frequencies relative to the ij:th cell;

$i, j$: the categories of the two scales, $i = 1, 2, ..., m_1$, $j = 1, 2, ..., m_2$.

The measure of order consistency (MA) is the difference between the proportions of ordered and disordered pairs. Hence:

\[ MA = (1 - D) - D = 1 - 2D \tag{2.19} \]
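A minimal Python sketch of formulas 2.18 and 2.19 is given below. It assumes that `table[i][j]` counts the individuals in category i of the first scale and category j of the second scale (lowest category first), and it interprets the "lower-right region" of a cell as all cells with a higher category on the first scale and a lower category on the second. Both the layout and this interpretation are assumptions of the sketch, and the two example tables are hypothetical.

```python
def disorder_and_consistency(table):
    """Return (D, MA) for a table of two scales, formulas 2.18 and 2.19."""
    m1, m2 = len(table), len(table[0])
    n = sum(sum(r) for r in table)
    disordered = 0
    t = 0
    for i in range(m1):
        for j in range(m2):
            x = table[i][j]
            # Assumed "lower-right region": higher on scale 1, lower on scale 2.
            x_lr = sum(table[k][l] for k in range(i + 1, m1) for l in range(j))
            disordered += x * x_lr
            t += x * (x - 1)
    # Note: the denominator is zero if all observations share one cell.
    d = 2 * disordered / (n * (n - 1) - t)
    return d, 1 - 2 * d


# Hypothetical tables (NOT study data)
ordered = [
    [2, 0, 0],
    [0, 3, 0],
    [0, 0, 1],
]
reversed_order = [
    [0, 0, 1],
    [0, 1, 0],
    [1, 0, 0],
]
print(disorder_and_consistency(ordered))
print(disorder_and_consistency(reversed_order))
```

The two extreme tables illustrate the range of the measures: a table concentrated on the main diagonal has no disordered pairs, while a table concentrated on the opposite diagonal consists only of disordered pairs.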

Chapter 3

Spinal Fusion Versus Nonsurgical

Treatment

3.1 Detailed Analysis of the Primary Endpoint Variable: Low

Back Pain

3.1.1 The Percentage of Agreement

Based on the data in Figure 2.1 (a), the PA of the fusion group is $PA = \frac{0+0+2+2+1}{37} = 13.51\%$, while for Figure 2.1 (b) $PA = 51.35\%$. This indicates that there is more disagreement in the fusion group for "Low Back Pain".

3.1.2 The Systematic Change

According to formulas 2.2, 2.3 and 2.7, the RP and RC values and the corresponding 95% confidence intervals of the fusion group can be calculated by the following process:

$C(X)$: 0, 0, 14, 32, 37;

$C(Y)$: 13, 24, 32, 35, 37;

$p_0 = \frac{1}{37^2}(13\times 0 + 11\times 0 + 8\times 0 + 3\times 14 + 2\times 32) \approx 0.07742878$;

$p_1 = \frac{1}{37^2}(0\times 0 + 0\times 13 + 14\times 24 + 18\times 32 + 5\times 35) \approx 0.794010226$;

$RP = p_0 - p_1 \approx -0.7166$.

$M = \min[0.07742878 - 0.07742878^2,\; 0.794010226 - 0.794010226^2] = 0.07143356$

\[ RC = \frac{1}{0.07143356 \times 37^3} \times [\, 11\times 0\times(37-0) - 0\times 13\times(37-24) + 8\times 0\times(37-14) - 14\times 24\times(37-32) + 3\times 14\times(37-32) - 18\times 32\times(37-35) + 2\times 32\times(37-37) - 5\times 35\times(37-37) \,] \approx -0.7246 \]

$\sigma^2_{jack}(RP) \approx 0.00697299$

$SE(RP) = \frac{37-1}{37}\times\sqrt{0.00697299} \approx 0.08124757$

The 95% confidence interval of RP is then easily obtained: (-0.8758, -0.5573). The same measures for the comparison group can be calculated through a similar process. All these measures of systematic change for the two groups, based on the main variable, are summarized in Table 3.1.

Table 3.1: Summary of the Measures for Systematic Change Based on the Main Variable

The negative RP value for the fusion group indicates that many patients shifted their choices to lower categories after surgery. The close-to-zero value of RP for the comparison group reveals that those who did not have the spinal fusion did not experience an obvious change, or even got worse.

The RC value of the fusion group is smaller than zero, which means that the marginal distribution of "Low Back Pain" after the fusion is less concentrated than the marginal distribution before the fusion. This is reasonable because the patients concentrated on the choices "3. moderate, 4. rather severe, 5. very severe" before the surgery, while some shifted to "1. none, 2. negligible" after (Figure 2.1 (a)). The RC value for the comparison group, by contrast, is very close to zero and can be neglected. In fact, the 95% confidence intervals of RP and RC for the comparison group both cover zero. So, the systematic change of the first item in BIS for the comparison group can be considered negligible.

It can be concluded that the treatment of the fusion group has some effect on the patients while

the treatment of the non-fusion group has not.

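The 95% confidence intervals of ∆RP and ∆RC for the main variable "Low Back Pain" can be obtained as follows, where the subscripts C and F denote the comparison group and the fusion group:

$$\Delta RP = RP_C - RP_F = 0.0935 - (-0.7166) = 0.8101 \quad \text{and} \quad \Delta RC = RC_C - RC_F = 0.0385 - (-0.7246) = 0.7631;$$

$$SE(\Delta RP) = \sqrt{0.0812476^2 + 0.1005080^2} \approx 0.12924;$$

$$SE(\Delta RC) = \sqrt{0.1023515^2 + 0.1127923^2} \approx 0.15231;$$

$$(RP_C - RP_F) \pm 1.96\,SE(RP_C - RP_F) = 0.8101 \pm 1.96\times 0.12924 \approx (0.5568,\ 1.0634);$$

$$(RC_C - RC_F) \pm 1.96\,SE(RC_C - RC_F) = 0.7631 \pm 1.96\times 0.15231 \approx (0.4646,\ 1.0616).$$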

Neither of the 95% confidence intervals covers zero, which means that there is a significant difference in systematic change between the groups. The conclusion is that the fusion group performed better on the variable "Low Back Pain".
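The between-group comparison can be sketched as follows, using the estimates and standard errors reported in this section. The function name `difference_ci` is hypothetical. The assignment of the two RC standard errors to the groups is an assumption (taken in the same order as for RP); it does not change the result, because the squared errors are simply added.

```python
from math import sqrt


def difference_ci(est_c, se_c, est_f, se_f, z=1.96):
    """Difference between two independent groups with a normal-theory CI."""
    diff = est_c - est_f
    se = sqrt(se_c**2 + se_f**2)  # variances add for independent groups
    return diff, se, (diff - z * se, diff + z * se)


# RP: comparison group (0.0935) minus fusion group (-0.7166)
d_rp, se_d_rp, ci_d_rp = difference_ci(0.0935, 0.1005080, -0.7166, 0.0812476)

# RC: comparison group (0.0385) minus fusion group (-0.7246);
# which standard error belongs to which group is assumed here.
d_rc, se_d_rc, ci_d_rc = difference_ci(0.0385, 0.1127923, -0.7246, 0.1023515)

print(round(d_rp, 4), [round(v, 4) for v in ci_d_rp])
print(round(d_rc, 4), [round(v, 4) for v in ci_d_rc])
```

An interval that excludes zero corresponds to a significant difference at the 5% level under the normal approximation.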

3.1.3 The Individual Change

Based on formulas 2.14-2.17, the values of the measures of individual change can be obtained; the results are shown in Table 3.2.

Table 3.2: Summary of the Measures for Individual Change Based on the Main Variable

The values of IV, the unobservable part of the individual change, are very close to zero for both groups, so most of the individual change is observable. Considering the RV values alone, the value for the fusion group is larger than that for the non-fusion group, indicating that the fusion group has more individual variance. The p-value of rα is smaller than 0.05 for the fusion group, but larger than 0.05 for the non-fusion group. With the help of rα, it is clear that the individual change can be neglected for the comparison group on the item "Low Back Pain", while fusion influences patients differently depending on their situation.


3.2 Analysis of all the items in BIS

3.2.1 Descriptive Analysis

The median values of the items, which represent the typical level of each variable, are shown in Table 3.3. For the columns "First Survey", "Second Survey" and "b Question", a lower number indicates a better situation. "Progress" is the difference between the values of the same variable in the first and the second survey; for this measure, a higher value is better. The fusion group evidently had more positive changes than the comparison group, since almost all the "Progress" values for the fusion group are larger than those for the comparison group. The "b Question" concerns the change in the corresponding item within one year after the operation. All the "b Question" values for the fusion group are smaller than those for the other group, which also indicates that the patients in the fusion group experienced more positive changes than those in the comparison group.

Table 3.3: The Average Performance on the Items of the BIS


3.2.2 The Percentage of Agreement

The percentage of agreement (PA) of the fusion group ranges from 8.11% to 37.84%, while the PA of the comparison group ranges from 36.11% to 70.27%. The lower PA of the fusion group implies that more patients changed category after the treatment in the fusion group. Improvements are expected, since the operation should be beneficial to the patients, but PA alone does not show the direction of the changes, so the direction needs to be studied further.

3.2.3 The Systematic Change

The systematic change gives further information about whether the treatment really has an effect on the patients. For the fusion group, the ROC curves of all the variables are above the diagonal (Figure 3.1), which shows that the surgery had a positive effect on patients for all items included in BIS. The same conclusion follows from the fact that the RP values of all questions for the fusion group are significantly smaller than zero. As for the RC values, only four confidence intervals do not cover zero for the fusion group (Figure 3.2). The corresponding items are "Limited Social Life", "Limitation in Living", "Perceived Concern" and "Low Back Pain", the last of which was already shown in Section 3.1.2. (VAS is not included in the BIS.) To get a closer picture of these variables, the marginal distributions of each variable are shown in Table 3.4. The responses are concentrated around the last two or three choices before the fusion and shift to the first two or three choices after. So the non-zero RC values do not affect the conclusion that, in the fusion group, patients chose lower categories one year after the surgery. Therefore, we will focus on the RP values when discussing the systematic change.

Table 3.4: The Marginal Distribution of the Four Variables for Both Situations in Fusion Group
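To illustrate how a curve in Figure 3.1 relates to the marginal distributions, the sketch below builds the points of an ROC curve for paired ordinal data by plotting cumulative proportions before treatment against cumulative proportions after treatment. This construction is assumed here; the plotting code used for the thesis is not shown. The data are the "Low Back Pain" frequencies of the fusion group from Section 3.1.2, and the function names are hypothetical.

```python
from itertools import accumulate


def roc_points(x, y):
    """Vertices of the ROC curve: cumulative proportion before (horizontal)
    against cumulative proportion after (vertical), starting at (0, 0)."""
    n = sum(x)
    if n == 0 or n != sum(y):
        raise ValueError("x and y must describe the same non-empty sample")
    cx = [c / n for c in accumulate(x)]
    cy = [c / n for c in accumulate(y)]
    return [(0.0, 0.0)] + list(zip(cx, cy))


def lies_on_or_above_diagonal(points):
    """True if no vertex is below the diagonal. Checking vertices is enough
    because the curve is piecewise linear between them."""
    return all(after >= before for before, after in points)


x_before = [0, 0, 14, 18, 5]  # fusion group, before treatment
y_after = [13, 11, 8, 3, 2]   # fusion group, after treatment

points = roc_points(x_before, y_after)
print(points)
print(lies_on_or_above_diagonal(points))
```

A curve above the diagonal means that, at every cut-off, a larger share of patients is at or below that category after treatment than before, which matches a negative RP.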

The comparison group shows another pattern. In Figure 3.1, the ROC curve of the comparison group is close to the diagonal for almost all items, some of them even below the diagonal line. Hence, treatment with physiotherapy and cognitive therapy, but without the fusion, does not change the condition of the patients. Sometimes even a negative effect is found.

[Figure 3.1 panels, one per item: Low Back Pain; Leg Pain; Limited Indoor Activities; Limited Outdoor Activities; Physical Health; Limited Leisure Activities; Limited Social Activities; Limited Social Life; Feeling Down; Feeling Impatience; Lack of Initiative; Concentration Difficulties; Sleep Disturbance; Overall Mental Health; Limitation in Living; Perceived Concern; Overall Quality of Life; VAS.Back; VAS.Leg. Horizontal axis: Before Treatment (0.0 to 1.0); vertical axis: After Treatment (0.0 to 1.0); one curve each for Fusion and Comparison.]

Figure 3.1: The Comparison of ROC Curves of the Fusion Group and the Comparison Group

Figure 3.2: The Comparison of 95% CI (RP and RC) for the Fusion Group and the Comparison Group

Figure 3.3: The 95% CI (∆RP and ∆RC) for Each Item on the BIS

The 95% confidence intervals of the ∆RP and ∆RC values for all the items are shown in Figure 3.3. (The exact values of ∆RP and ∆RC are displayed in Appendix A.) For all the variables, no 95% confidence interval of ∆RP covers zero, so the differences in RP between the two groups are all statistically significant. Almost all the 95% confidence intervals of the ∆RC values do cover zero, which means that the groups have no significant differences in the change of relative concentration. But as stated before, this does not affect the final results. In summary, we can conclude once again that spinal fusion has a positive effect on patients who are suffering from low back pain.

3.2.4 The Individual Change

Though the fusion seems effective for treating low back pain, the influence may differ among individuals. All the IV values, which measure the unobservable part of the individual change before and after the treatment, are negligible, so the individual changes can be considered observable. Large RV and small rα values indicate that significant individual variance may exist. The normal distribution is used to test the null hypothesis rα = 1, and the related p-values are shown in italics in Table 4 in the appendix. Besides "Low Back Pain", which is discussed in Section 3.1.3, the p-values of the items "Limited Indoor Activities", "Physical Health", "Limited Social Activities", "Limited Social Life", "Sleep Disturbance", "Limitation in Living" and "Overall Quality of Life" in the fusion group are also smaller than 0.05, so the null hypothesis is rejected for these eight items and individual variance is considered to exist in them. This reveals significant individual differences that are not covered by the systematic change, so spinal fusion has a varying influence on the patients for these eight variables. Only two corresponding p-values in the non-fusion group are smaller than 0.05, indicating that for the comparison group the changes are mainly explained by the systematic change.


3.3 Results Related to VAS

The VAS scales are treated as ordinal data in this study. The PA value is very close to zero for both groups. This is plausible because VAS has 100 categories, which lowers the probability that an individual chooses the same category at two separate time points. Judging from the RP values and the ROC curves for the two groups, the conclusion is that the fusion group experienced a positive outcome. However, the small rα reveals that individual differences should be considered when deciding whether a certain patient should have the fusion or not.

Detailed information about the PA, RP, RC, RV, IV and rα values of each item for the two groups is shown in Appendix A.


Chapter 4

Comparison of BIS and ODI

4.1 Comparison of the Primary Endpoint Variable

A similar analysis can be performed based on the Oswestry Low Back Pain Disability Index (ODI), and it gives the same conclusion in this study. (Detailed results based on ODI, with the answers treated as ordinal data, can be found in Appendix B.) According to formulas 2.18 and 2.19, the measure of disorder (D) and order consistency (MA) are calculated for similar questions in both outcome scales in order to detect whether they are consistent with each other.

Table 4.1: D and MA Between the Pain Extent Assessments on the BIS and ODI

The value of MA is very high between "BIS 1: Low Back Pain" and "ODI 1: Pain Intensity" both before and after the fusion. Figure 4.1 gives the plots of the marginal distributions of the first item for both questionnaires and situations (before and after) for the fusion group. Comparing the two red lines in Figure 4.1 (a) and (b), the main reason for the disorder before the treatment was that the BIS responses were more focused on "Rather Severe" while the ODI responses were more focused on "Moderate". The contingency table shows that twelve patients chose "Rather Severe" in BIS but "Moderate" in ODI (Figure 4.2 (a)). On the other hand, the after-fusion situation shows high consistency. In Figure 4.2 (b), the non-zero numbers lie very close to the line running from lower left to upper right. Consequently, the blue lines in Figure 4.1 (a) and (b) have very similar shapes, which is consistent with the fact that the value of MA after fusion is higher than that before. This may also indicate that BIS is more sensitive on this question, since before the fusion its responses were focused on higher categories than those of ODI, while after the fusion both scales showed the same concentration.

[Figure 4.1 panels: (a) BIS 1: Low Back Pain, categories None, Negligible, Moderate, Rather Severe, Very Severe; (b) ODI 1: Pain Intensity, categories No Pain, Mild, Moderate, Fairly Severe, Very Severe, Unbearable. Vertical axis: Probability (0 to 50); lines: Before Treatment and After Treatment.]

Figure 4.1: The Marginal Distribution of the First Item of BIS and ODI (Fusion Group)

Figure 4.2: The Contingency Table of the Pairs of the First Item for BIS and ODI (Fusion Group), panels (a) and (b)

The comparison group shows a similar situation when comparing the consistency between "BIS 1: Low Back Pain" and "ODI 1: Pain Intensity". This indicates that BIS and ODI show very similar results on the measure of low back pain.

By contrast, the value of MA between "BIS 2: Leg Pain" and "ODI 1: Pain Intensity" is moderate, which reveals that these two questions do not give the same information. Thus, BIS provides more information on chronic low back pain than the other scale.

4.2 Comparison of Activities, Social Life and Sleeping

The values of D and MA between the questions about activities on the two questionnaires are listed in Table 4.3. BIS and ODI seem to give more divergent responses in this aspect, because the consistency is lower than in the pain extent dimension.

However, BIS is still considered better than ODI in this area since the way it summarizes the

different activities is more reasonable and informative.

Social life and sleeping problems are the other two aspects included in both BIS and ODI. The D and MA values are shown in Table 4.2. The consistency between the pairs related to social life is not very high. BIS has two questions related to social life while ODI has only one. However, respondents may get confused, since the two social life questions on BIS have no clear boundary between them. Therefore, ODI may give better results in this aspect.

Table 4.2: D and MA Between the Social Life and Sleeping Assessments on the BIS and ODI

The questions concerning sleeping problems in both questionnaires have a high consistency.

Since no big differences between the situations before and after the treatment without fusion

are found, only the marginal distributions of before and after fusion are plotted in Figure 4.3.

Table 4.3: D and MA Between the Activities Assessments on the BIS and ODI

Before the fusion, the responses of BIS were more focused on "Rather Severe" while ODI’s

responses were more concentrated on "Moderate". After the fusion, both of them mostly shifted

to "None". Therefore, BIS seems to be more sensitive to this question since it has reflected more

changes.

[Figure 4.3 panels: BIS 14: Sleep Disturbance, categories None, Negligible, Moderate, Rather Severe, Very Severe; ODI 7: Sleeping, categories No Pain, Mild, Moderate, Fairly Severe, Very Severe, Unbearable. Vertical axis: Probability (0 to 50).]

Figure 4.3: The Marginal Distribution of the Sleeping Assessment of BIS and ODI (Fusion Group)

4.3 Summary of the Comparison

Apart from what has been discussed above, BIS has more questions about mental health and the overall situation that are not covered by ODI. Hence, BIS contains more information than ODI. The most important point is that BIS uses descriptive scales whereas ODI uses numerical rating scales with a scoring instruction. As has been mentioned, scoring methods are not appropriate for ordinal data.

Therefore, BIS is recommended over ODI for its descriptive scales and its greater information capacity.


Chapter 5

Summary

5.1 Discussion and Further Study

Since the actual cause of chronic low back pain still remains unknown, it is difficult to provide a "perfect" prescription to cure this disease. Although spinal fusion is an option, the fact that no technique today can detect exactly which segment causes the pain makes this option open to question.

Not many randomized studies have examined whether spinal fusion really eliminates low back pain, and the statistical methods used vary among these studies. The current study is a supplement in this area, and a more appropriate method has been applied in the statistical analysis. Svensson’s method is a rank-invariant non-parametric method that avoids mathematical operations which are inappropriate for ordinal data. Further, this method can separate the systematic and individual changes. The conclusion using this method is that spinal fusion is to be recommended to the patients.

However, a lot of work concerning spinal fusion still remains to be done. First, which kinds of patients benefit most from this surgery? Since the patients in the fusion group show more individual variance in this study, this question deserves further research. Second, several methods of spinal fusion are available; which surgical method is the most suitable for a specific kind of patient? Third, more follow-up time periods can be included. The situation changes with the passage of time, and several follow-up periods will provide more information about the changes. Fourth, simulation studies can be done to compare Svensson’s method with other widely used methods in order to assess the stability and the reliability of each method in such randomized studies.

5.2 Conclusion

The conclusion so far is that spinal fusion had a significant positive effect on patients with chronic low back pain in this study. Also, the Balanced Inventory for Spinal Disorders (BIS) is superior to the Oswestry Low Back Pain Disability Index (ODI) in randomized studies related to spinal fusion.


Acknowledgements

Foremost, I would like to express my deepest gratitude and appreciation to my supervisor Prof. Adam Taube for his excellent guidance, constructive suggestions, immense knowledge and great patience. I would also like to thank Dr. Bo Nyström for providing the fruitful idea of this thesis and the data source, and for his great support on the medical parts of this thesis. Thanks also to Birgitta Schillberg, the co-worker of Dr. Bo, for collecting the answers from the patients and arranging the data in Excel. My sincere appreciation also goes to Elisabeth Dahlqwist, a Master's student in the Department of Statistics at Uppsala University, for sharing her Bachelor thesis (co-author: Erika Åhlander) and for helping me read the Swedish study protocol. For assistance with R programming I want to thank Darlene LY. Dai, a statistician working at the NCE CECR Prevention of Organ Failure (PROOF) Centre of Excellence, Vancouver, Canada. I thank my friend Cheng Gong and Jennifer Ast, the teacher from the English language workshop, for their advice on the language. Finally, my thanks go to my boyfriend, who has never appeared in the past 25 years, which has enabled me to focus on my study and research in statistics.



Appendix A: Detailed Results About Spinal Fusion vs. Non-surgical Treatment Based on BIS

Table A.1: The Difference of RP and RC Values for the Two Groups on the Items of the BIS


Table A.2: Measure of Agreement and Disagreement in Paired Assessments on the Items of the BIS


Appendix B: Detailed Results About Spinal Fusion vs. Non-surgical Treatment Based on ODI

[Figure B.1 panels, one per item: Pain Intensity; Personal Care; Lifting; Walking; Sitting; Standing; Sleeping; Sex Life; Social Life; Travelling; Summary. Horizontal axis: Before Treatment (0.0 to 1.0); vertical axis: After Treatment (0.0 to 1.0); one curve each for Fusion and Comparison.]

Figure B.1: The Comparison of ROC Curves of the Fusion Group and the Comparison Group (ODI)


Appendix C: The Balanced Inventory for Spinal Disorders


Appendix D: Oswestry Low Back Pain Disability Index
