Analysis of Spinal Fusion Versus Nonsurgical Treatment for Chronic Low
Back Pain Based on Elisabeth Svensson’s Method
Yingyan Zhu
Department of Statistics
Uppsala University
Supervisor: Adam Taube
2014
Abstract
The utility of spinal fusion for patients with chronic low back pain is controversial. The aim of
the current study is to find out whether spinal fusion benefits the patients or not. Additionally,
two outcome scales, the Balanced Inventory for Spinal Disorders (BIS) and Oswestry Low Back
Pain Disability Index (ODI), are compared to detect which one is preferable. A prospective
randomized study concerning spinal fusion has recently been performed in Sweden. 74 patients
aged 18 to 65 with low back pain lasting longer than one year were randomized into two
treatment groups: a fusion group and a non-fusion group. Evaluations of physical sensations
based on several outcome scales were conducted both before and one year after the treatment.
Elisabeth Svensson’s Method, which measures the level of changes both from systematic and
individual aspects, was applied to analyse the effect of spinal fusion. Measures of disorder
(D) and order consistency (MA) were calculated when comparing the two outcome scales. The
conclusions are that spinal fusion has shown significant advantages over non-surgical treatment
methods, and that BIS is recommended over ODI.
Keywords: Chronic Low Back Pain; Spinal Fusion; Svensson’s Method; BIS
Contents
Abstract
1 Introduction
1.1 Background
1.2 Previous Studies
1.3 The Aims of the Study
2 Methodology
2.1 Study Basis
2.2 Statistical Methods: Elisabeth Svensson’s Method
2.2.1 The Contingency Table
2.2.2 Rank Transformable Pattern of Change
2.2.3 Systematic Change
2.2.4 Individual Change
2.2.5 The Measure of Disorder (D) and Order Consistency (MA)
3 Spinal Fusion Versus Nonsurgical Treatment
3.1 Detailed Analysis of the Primary Endpoint Variable: Low Back Pain
3.1.1 The Percentage of Agreement
3.1.2 The Systematic Change
3.1.3 The Individual Change
3.2 Analysis of all the items in BIS
3.2.1 Descriptive Analysis
3.2.2 The Percentage of Agreement
3.2.3 The Systematic Change
3.2.4 The Individual Change
3.3 Results Related to VAS
4 Comparison of BIS and ODI
4.1 Comparison of the Primary Endpoint Variable
4.2 Comparison of Activities, Social Life and Sleeping
4.3 Summary of the Comparison
5 Summary
5.1 Discussion and Further Study
5.2 Conclusion
Acknowledgements
References
Appendix
A Detailed Results About Spinal Fusion Versus Nonsurgical Treatment Based on BIS
B Detailed Results About Spinal Fusion Versus Nonsurgical Treatment Based on ODI
C The Balanced Inventory for Spinal Disorders (BIS)
D Oswestry Low Back Pain Disability Index (ODI)
Chapter 1
Introduction
1.1 Background
It is very common to suffer from back pain in daily life. According to the book "Low Back Pain
FAQs" published in 2007, back pain affects over 70% of people at a certain point in their lives
[6]. It is reported to be the most common cause of limited activities for those who are under 45
years old [2, 4, 6]. Although it might vanish by itself after several days or weeks, back pain is the
second most common symptom that requires visiting a physician [2, 7].
"Chronic Low Back Pain" is a kind of back pain in the lower spine, which lasts more than 12
weeks and may be a continuous problem [6]. Specifically, it is a nagging or aching pain from
the belt line to the tail-bone of the spine [6] (Figure 1.1). It occurs mostly among people
over 30 years old, especially those aged between 45 and 65 [2, 6]. There are various
possible causes of low back pain, such as poor posture, obesity, excessive physical labor,
fatigue and emotional stress [6]. Yet the actual reason for its occurrence seems to be
under-investigated today [6]. Common treatments of chronic low back pain include exercise,
massage, cognitive therapy, physical therapy, spinal fusion, etc.
For decades, spinal fusion, also known as "lumbar fusion", "arthrodesis" or "lumbar spinal
fusion", has been used as a surgical way to relieve low back pain [1, 12]. Initially it was used
to deal with infections, deformity, degenerative conditions, spinal tuberculosis and fractures,
and then gradually came to be used to treat lumbar instability and degenerative disc disease [1,
2, 3]. It minimizes or eliminates the pain by preventing movement between vertebrae [2,
3]. For example, in some fusion techniques the disc is removed and the two vertebrae are
joined into a single unit. There are several spinal fusion techniques, such as posterolateral
arthrodesis, lumbar interbody arthrodesis, posterior lumbar interbody arthrodesis, interbody
fusion cages, etc. [2]. Each method suits different patient situations and has
both advantages and disadvantages. So before a spinal fusion is performed, specific
pathoanatomical diagnoses should be made in order to find a better timing for the surgery and
improve the suitability of the fusion for the patient’s symptoms [2].
Figure 1.1: Spinal Column [6]
Undertaking spinal fusion is risky and costly. First, radiography (X-ray) and similar methods
cannot show the exact segment of the spine in which the back pain occurs [6, 7, 8]. Second,
spinal fusion is associated with many complications, such as mortality, thrombosis, hepatitis,
gastrointestinal, urinary tract infection, pulmonary complications, skin problems, psychological
problems and coping problems [13]. Third, the costs are considered to be very high [2, 5, 9,
10].
The high costs, the increasing rates of re-operations and the risk of complications make the
question of whether spinal fusion actually helps the patients more and more urgent. Hence, it
is of great importance to conduct research in this field.
1.2 Previous Studies
There has been much discussion about whether spinal fusion should be recommended or not.
However, very few clinical trials have been done. To the author’s knowledge, only four studies
have been conducted during the last 15 years in Europe. These studies have led to varying
conclusions (Table 1.1).
Table 1.1: Summary of the Four Previous Studies
The first study was a multicenter randomized controlled trial performed in Sweden by Fritzell
et al. [14]. 294 patients were randomized into four groups, three of which were fusion groups
with different fusion techniques and one of which was a comparison group. 222 patients were assigned
to the three fusion groups and treated as one single group in the analysis. The comparison
group consisted of 72 patients. This study measured different aspects by using various scales,
and the fusion group performed significantly better in almost all the aspects except "Depressive
Symptoms". The final conclusion was that fusion was more efficient than the commonly used
non-surgical treatment. Among the four recent studies, Fritzell et al.’s is the only contribution
that gives a positive conclusion for fusion. However, the big difference in the number of patients
between the fusion and non-fusion groups has affected the results. Furthermore, Dr. Mooney
questioned the criteria by which patients were assigned to a group [14]; he thought these patients
were those who were not suitable for surgery. These two weaknesses of the study reduce the
reliability of its conclusion.
Two studies were performed by Brox and his colleagues [15, 16]. The first one was a randomized
clinical trial with 64 (37+27) patients in Norway in 2003. The Oswestry Low Back Pain
Disability Index (ODI) was used as the primary scale and the Mann-Whitney test was employed to
calculate the p-value for the difference in changes from baseline to 1-year follow-up. According
to this single p-value (0.33), the authors concluded that "The main outcome measure showed
equal improvement in patients with chronic low back pain and disc degeneration randomized
to cognitive intervention and exercises, or lumbar fusion" [15, p1913]. But if the secondary
outcome measures are also taken into consideration, the conclusion needs to be re-evaluated. The
secondary outcome measures cover 12 aspects and six of them have p-values smaller than 0.05,
which means that in at least six out of twelve aspects a significant difference between the two
groups can be seen. These results may indicate that fusion performs better with regard to at
least half of the aspects. The second study under the leadership of Brox, a similar randomized
clinical trial with 60 (29+31) patients in 2006, has a similar problem [16]. Three out of
eight aspects showed significant differences between the two groups. But the final conclusion
was based on the single p-value for the main measure identified by the authors.
The research conducted by Fairbank et al. [12] in 2005 in the United Kingdom was also a
multicenter randomized controlled trial, which involved 349 patients from 15 centres. Several
outcome scales were used in this study, and almost all the p-values of the difference in changes
were larger than 0.05. Hence, it was concluded that there was no clear evidence that
spinal fusion has a significant benefit for the patients.
The last three studies reviewed above treated the results of the scales as scores that can be
regarded as continuous data and used statistical methods such as multiple regression, which is
not appropriate for ordinal data. The outcome scales in such clinical research are evaluations
of subjective physical sensations. These kinds of questionnaires are rating scales even if they
have a standard scoring method stated by the questionnaire itself. The mathematical operations
that were used were not appropriate for this kind of data.
Regardless of the statistical methods that have been used in these studies, the conclusions are
not consistent. So it remains uncertain whether spinal fusion benefits patients with
chronic low back pain.
1.3 The Aims of the Study
The analysis of spinal fusion versus non-surgical treatment for chronic low back pain will be
conducted in this paper. The aim is to find whether the spinal fusion is of benefit to the patients
who suffer from chronic low back pain or not. For the analysis of the systematic and random
differences between paired ordinal categorical data, a rank-invariant non-parametric method,
which has not been used before in such a randomized study related to spinal fusion, will be
applied in this study. Meanwhile, BIS and ODI will be compared in order to find out which one
is to be preferred.
Chapter 2
Methodology
2.1 Study Basis
A prospective randomized study concerning spinal fusion has recently been performed by the
Clinic of Spinal Surgery in Strängnäs, Sweden (CSS), an organization specializing in
spinal surgery. The patients involved in this study were randomized into two groups. One
group was given surgical treatment, while the other had the non-surgical treatment. (Detailed
treatment therapy for the two groups is listed in Table 2.1.) Each group has 37 patients, thus
there are a total of 74 patients involved in the study.
Table 2.1: Treatment Therapy for the Two Groups
Oral and written information was given to the patients before they were involved in the study,
and informed consent was also given. The following criteria were applied for the inclusion
of patients [11]:
1. Age between 18 and 65;
2. The discomfort or symptoms of low back pain have continued for at least one year;
3. Extensive conservative treatment that includes physical therapy should have been performed
for at least seven weeks, but severe discomfort still continues;
4. No fusion surgery in the lumbar area has been conducted before;
5. Patients should have distinctive symptoms such as severe low back pain located centrally
in the vertebral column, aching in character and combined with stabbing pain at sudden
movements, diffuse pain radiation down the legs and, in most cases, bladder dysfunction with urgency
and frequency.
These patients can be considered as a well-defined group with specific symptoms that can be
separated from other patients with severe back problems.
The operations were performed by Doctor Bo Nyström with micro-surgical techniques. The
patients in both groups were evaluated before the treatment and at one year after the treatment.
Several different outcome scales have been applied. They are the Balanced Inventory for Spinal
Disorders (BIS), Oswestry Low Back Pain Disability Index (ODI), 36-Item Short Form Survey
Instrument (SF-36) and European Quality of Life Scale (Euro-Qol). Among them BIS is the
newest one, and of special interest in this study. The others are already widely used in this field.
BIS and ODI will be compared later.
BIS is a questionnaire designed to detect the influence of back and leg pain on physical
health, social life, mental health and quality of life. Its validity and reliability have already
been demonstrated [17, 18]. It has 17 items in total for these four dimensions, and the answer
alternatives are ordered categories like: "1. none, 2. negligible, 3. moderate, 4. rather severe,
5. very severe". The 17 items are: "Low Back Pain", "Leg Pain", "Limited Indoor Activities",
"Limited Outdoor Activities", "Physical Health", "Limited Leisure Activities", "Limited Social
Activities", "Limited Social Life", "Feeling Down", "Feeling Impatience", "Lack of Initiative",
"Concentration Difficulties", "Sleep Disturbance", "Overall Mental Health", "Limitation in
Living", "Perceived Concern" and "Overall Quality of Life". The patients’ physical sensations
were surveyed twice. The second BIS survey contains additional questions
about the change after the surgery in the corresponding item. These questions are
of the type: "How is your back pain today? (change in back pain at follow-up)" and the answer
alternatives are "1. no pain, 2. much better, 3. somewhat better, 4. unchanged, 5. somewhat worse, 6.
much worse". The complete BIS and ODI questionnaires are shown in Appendices C and D.
According to the study protocol, the main variables are related to "Change in pain in the low
back and radiating to the legs and the discomfort related to limitations in activity one year
after treatment" [11, appendix 2, p3]. This so-called primary endpoint contains a number of
variables. Therefore, in our analysis we have selected the first item in BIS, "Low Back Pain",
as the primary endpoint variable. All the other variables will be analysed in a similar manner.
2.2 Statistical Methods: Elisabeth Svensson’s Method
The answers obtained from the questions are ratings on ordinal scales. Although the answers are labelled with
numbers, they carry only an ordinal meaning, not a numerical one [19]. Addition, subtraction
and any parametric methods are inappropriate and have no empirical meaning [20]. So the
differences between scale points do not give an adequate measure of the change in
this study.
Methods like the sign test and McNemar’s test are suitable for analysing ordered categorical data [21,
22, 23]. However, they only test whether a change between two occasions exists or not,
and thus do not use the information in the data efficiently [21, 22]. A rank-invariant non-parametric
method for the analysis of ordered scales was developed by Svensson in 1993;
it measures the level of change from both systematic and individual aspects [24, 25]. This
method will be used here to analyse the effect of the fusion.
Svensson’s method has a lot of good properties, but it is not yet widely used. Therefore, it will
be introduced in detail with the primary endpoint variable.
2.2.1 The Contingency Table
A contingency table like Figure 2.1 (a) is formed as a basis for Svensson’s method. The X-axis
stands for the observations before treatment while the Y-axis stands for the observations after
treatment. The number in each cell represents how many patients chose the i:th category
before treatment and the j:th category after. For example, the number "4" in the (Rather Severe,
Moderate) cell means there were four patients who felt the pain in the low back as "rather
severe" before the treatment and "moderate" after. Apparently, the treatment was effective for
these four people.
The sums of each row and column form the marginal distributions for the two occasions sep-
arately. The overall systematic change is evaluated with the help of this [22]. By comparing
the marginal distributions in Figure 2.1 (a) and (b), it is clear that more patients shifted their
choices to lower categories in the fusion group than in the comparison group for item "Low
Back Pain".
Figure 2.1: The Frequency Distribution of the Pairs Before and After Treatment for "Low Back
Pain": (a) Fusion Group, (b) Comparison Group
The lower-left to upper-right line of the contingency table is the agreement line where the
patients have given the same answer at the two time points. So the percentage agreement (PA)
which shows the proportion of the patients who did not change their choices is defined as:
\[ PA = \frac{\text{Sum of the Observations on the Agreement Line}}{\text{Total Observations}} \tag{2.1} \]
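Formula 2.1 can be sketched in a few lines of Python. The agreement-line counts (0, 0, 2, 2, 1) and the group size of 37 are the fusion-group values used in Section 3.1.1; the function name is only illustrative.

```python
def percentage_agreement(agreement_counts, n_total):
    """Share of patients (in percent) who gave the same answer on both occasions."""
    return 100.0 * sum(agreement_counts) / n_total


# Fusion group, item "Low Back Pain": counts on the agreement line, n = 37
pa_fusion = percentage_agreement([0, 0, 2, 2, 1], 37)
print(f"PA = {pa_fusion:.2f}%")
```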
2.2.2 Rank Transformable Pattern of Change
Starting from the marginal distributions in Figure 2.1 (a), we can construct the so called rank
transformable pattern of change (RTPC) table (Figure 2.2) by means of the following rules:
1. The sum of the first column is zero, so only "0" observations can be put in the cell (None,
None);
2. The sum of the lowest row is 13. Since the (None, None) cell is empty, the 13 observations would
next go in the cell (Negligible, None), but the sum of the second column is also zero, so
only "0" can be put there. The next candidate is the cell (Moderate, None): since 13 is
smaller than the column total 14, "13" can be put there;
3. Only one observation is left for the third column; since one is smaller than the row total 11,
"1" can be put in the cell (Moderate, Negligible);
4. Then ten observations are left for the second lowest row, and ten is smaller than the column
total 18, so "10" can be put in the cell (Rather Severe, Negligible);
5. Continue with similar calculations until the sum of the whole table equals the total sample
size.
Figure 2.2: The RTPC Table of Item "Low Back Pain" for Fusion Group
The RTPC table shows the only pattern in which the ranking of the patients is exactly the same
before and after. The RTPC is very important because it is the key to Svensson’s method:
it describes the systematic shift between the two time points.
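The filling rules above amount to a greedy pass over the two marginal distributions, which can be sketched as follows. The marginal frequencies are those implied by the cumulative frequencies listed in Section 3.1.2 for the fusion group; the function name and the `table[j][i]` layout are assumptions made for this illustration.

```python
def rtpc(before, after):
    """Build a rank-transformable pattern of change from two marginal distributions.

    before[i] and after[j] are category frequencies, lowest category first.
    Returns table[j][i]: number of patients in category i before and j after.
    """
    if sum(before) != sum(after):
        raise ValueError("both marginals must sum to the same sample size")
    m = len(before)
    x, y = list(before), list(after)
    table = [[0] * m for _ in range(m)]
    i = j = 0
    while i < m and j < m:
        k = min(x[i], y[j])  # as many observations as both margins still allow
        table[j][i] += k
        x[i] -= k
        y[j] -= k
        if x[i] == 0:        # "before" category used up: move to the next column
            i += 1
        if y[j] == 0:        # "after" category used up: move to the next row
            j += 1
    return table


before = [0, 0, 14, 18, 5]   # none, negligible, moderate, rather severe, very severe
after = [13, 11, 8, 3, 2]
table = rtpc(before, after)
for row in reversed(table):  # print with the lowest "after" category at the bottom
    print(row)
```

The first filled cells reproduce the values derived in rules 2 to 4: 13 in (Moderate, None), 1 in (Moderate, Negligible) and 10 in (Rather Severe, Negligible).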
2.2.3 Systematic Change
2.2.3.1 Relative Position, RP
The systematic change in position between the two ordered categorical classifications is called
the relative position (RP) and it is theoretically defined as $P(X < Y) - P(Y < X)$, where X
and Y denote the categorical distributions before and after treatment, respectively
[18, 22-24]. The empirical measure can be calculated as [24, 25]:
\[ RP = p_0 - p_1 = \frac{1}{n^2}\sum_{v=1}^{m} y_v\, C(X)_{v-1} - \frac{1}{n^2}\sum_{v=1}^{m} x_v\, C(Y)_{v-1} \tag{2.2} \]
where
n: the sum of the observations;
m: the number of categories;
$x_v$ and $y_v$: the v:th category frequencies of X and Y;
$C(X)_v$ and $C(Y)_v$: the v:th category cumulative frequencies of the marginal distributions X and Y, with $C(X)_0 = C(Y)_0 = 0$.
The value of RP ranges from -1 to 1 since it is the difference of two probabilities [22, 24]. RP
reaches the lower limit -1 if the marginal distribution on occasion Y is shifted completely
below that of occasion X, and the upper limit 1 in the opposite case [22, 24]. An RP close to zero
indicates that negligible systematic change in position has occurred between the two occasions
[22, 24]. In this study, an RP value smaller than zero implies a systematic change towards lower
categories, which is the desired effect.
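A minimal sketch of formula 2.2 in Python, using the fusion-group marginal frequencies implied by the cumulative frequencies in Section 3.1.2. The function name is an assumption for illustration; only the standard library is used.

```python
from itertools import accumulate


def relative_position(x, y):
    """Return (p0, p1, RP) for marginal frequencies x (before) and y (after)."""
    n = sum(x)
    cum_x = [0] + list(accumulate(x))  # cum_x[v] is C(X)_v, with C(X)_0 = 0
    cum_y = [0] + list(accumulate(y))
    # With 0-based v, y[v] is the frequency of category v + 1 and cum_x[v] is
    # the cumulative frequency up to the category just below it.
    p0 = sum(y[v] * cum_x[v] for v in range(len(y))) / n**2
    p1 = sum(x[v] * cum_y[v] for v in range(len(x))) / n**2
    return p0, p1, p0 - p1


p0, p1, rp = relative_position([0, 0, 14, 18, 5], [13, 11, 8, 3, 2])
print(f"p0 = {p0:.8f}, p1 = {p1:.9f}, RP = {rp:.4f}")
```

The result can be compared with the hand calculation in Section 3.1.2.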
2.2.3.2 Relative Concentration, RC
In order to have a comprehensive evaluation of the systematic change, the systematic change
in concentration (RC) is defined by $P(X_{l_1} < Y_k < X_{l_2}) - P(Y_{l_1} < X_k < Y_{l_2})$, where $X_{l_1}$,
$Y_k$, $X_{l_2}$, $Y_{l_1}$, $X_k$ and $Y_{l_2}$ denote discrete independent random variables [22, 23, 24]. The
empirical measure can be calculated by [24, 25]:
\[ RC = \frac{1}{M n^3}\left\{\sum_{v=1}^{m}\left[\, y_v\, C(X)_{v-1}\,(n - C(X)_v) - x_v\, C(Y)_{v-1}\,(n - C(Y)_v)\right]\right\} \tag{2.3} \]
where $M = \min[p_0 - p_0^2,\; p_1 - p_1^2]$.
The RC value ranges from $-p(1-p)$ to $p(1-p)$, where $p = P(X_k \le Y_{l_1})$. It will be larger than
zero when the marginal distribution of occasion Y is more concentrated relative to the marginal
distribution of occasion X [24, 26].
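Formula 2.3 can be sketched in the same way. The marginal frequencies are again those of the fusion group from Section 3.1.2, and the function name is illustrative. The sketch assumes M is non-zero, which holds whenever the two marginal distributions overlap.

```python
from itertools import accumulate


def relative_concentration(x, y):
    """Return RC for marginal frequencies x (before) and y (after)."""
    n, m = sum(x), len(x)
    cum_x = [0] + list(accumulate(x))  # cum_x[v] is C(X)_v, with C(X)_0 = 0
    cum_y = [0] + list(accumulate(y))
    p0 = sum(y[v] * cum_x[v] for v in range(m)) / n**2
    p1 = sum(x[v] * cum_y[v] for v in range(m)) / n**2
    norm = min(p0 - p0**2, p1 - p1**2)  # the constant M in formula 2.3
    total = sum(
        y[v] * cum_x[v] * (n - cum_x[v + 1]) - x[v] * cum_y[v] * (n - cum_y[v + 1])
        for v in range(m)
    )
    return total / (norm * n**3)


rc = relative_concentration([0, 0, 14, 18, 5], [13, 11, 8, 3, 2])
print(f"RC = {rc:.4f}")
```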
2.2.3.3 Standard Error and 95% Confidence Intervals of RP and RC
The measures RP and RC are asymptotically normally distributed, and the variances of their
empirical measures are estimated with the help of the jackknife technique [21-24]. According
to Svensson’s proof, VAR(RP) can be obtained by the following process:
\[ \sigma^2_{jack}(RP) = \frac{n-1}{n}\sum_{\kappa=1}^{n}\left(RP_{(\kappa)} - RP_{(\bullet)}\right)^2 \tag{2.4} \]
where
$RP_{(\kappa)}$: the Relative Position with one observation, κ, deleted;
$RP_{(\bullet)}$: the average of all possible values of Relative Position with one observation deleted,
κ = 1, ..., n.
Further,
\[ VAR(RP) = \left(\frac{n-1}{n}\right)^2 \sigma^2_{jack}(RP) \tag{2.5} \]
Then,
\[ SE(RP) = \frac{n-1}{n}\,\sigma_{jack}(RP) \tag{2.6} \]
And the 95% confidence interval will be:
\[ RP \pm 1.96 \times SE(RP) \tag{2.7} \]
The standard error and 95% confidence interval of RC can be obtained by analogous formulas,
replacing every RP with RC in formulas 2.4-2.7.
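The jackknife steps in formulas 2.4 to 2.7 can be sketched as below. The 3x3 table is hypothetical, invented only to exercise the code and not taken from the study, and the function names are assumptions. The sketch expects a table with more than one observation.

```python
from itertools import accumulate
from math import sqrt


def relative_position(table):
    """RP for a square table with table[i][j] = count of (before i, after j)."""
    x = [sum(row) for row in table]        # marginal distribution before
    y = [sum(col) for col in zip(*table)]  # marginal distribution after
    n = sum(x)
    cum_x = [0] + list(accumulate(x))
    cum_y = [0] + list(accumulate(y))
    p0 = sum(y[v] * cum_x[v] for v in range(len(y))) / n**2
    p1 = sum(x[v] * cum_y[v] for v in range(len(x))) / n**2
    return p0 - p1


def jackknife_se_rp(table):
    """Standard error of RP following formulas 2.4 and 2.6."""
    n = sum(map(sum, table))
    leave_one_out = []
    for i, row in enumerate(table):
        for j, count in enumerate(row):
            if count == 0:
                continue
            reduced = [list(r) for r in table]
            reduced[i][j] -= 1  # delete one observation from cell (i, j)
            # every observation in the same cell gives the same RP value
            leave_one_out.extend([relative_position(reduced)] * count)
    mean_loo = sum(leave_one_out) / n
    var_jack = (n - 1) / n * sum((v - mean_loo) ** 2 for v in leave_one_out)
    return (n - 1) / n * sqrt(var_jack)


hypothetical = [[2, 1, 0], [3, 4, 1], [0, 2, 5]]  # illustrative data only
rp = relative_position(hypothetical)
se = jackknife_se_rp(hypothetical)
ci = (rp - 1.96 * se, rp + 1.96 * se)             # formula 2.7
print(rp, se, ci)
```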
2.2.3.4 ∆RP and ∆RC
In order to test whether there is a significant difference between the two groups, $\Delta RP = RP_C - RP_F$,
$\Delta RC = RC_C - RC_F$ and the corresponding 95% confidence intervals are calculated.
The two groups are independent, so the RP and RC values for the two groups are independent.
The standard errors of $\Delta RP$ and $\Delta RC$ can be obtained by:
\[ SE(\Delta RP) = \sqrt{VAR(\Delta RP)} = \sqrt{VAR(RP_C - RP_F)} = \sqrt{VAR(RP_C) + VAR(RP_F)} \tag{2.8} \]
\[ SE(\Delta RC) = \sqrt{VAR(\Delta RC)} = \sqrt{VAR(RC_C - RC_F)} = \sqrt{VAR(RC_C) + VAR(RC_F)} \tag{2.9} \]
where $VAR(RP_F)$, $VAR(RP_C)$, $VAR(RC_F)$ and $VAR(RC_C)$ can be obtained by formula
2.5.
The approximate 95% confidence intervals for $\Delta RP$ and $\Delta RC$ are:
\[ (RP_C - RP_F) \pm 1.96\, SE(RP_C - RP_F) \tag{2.10} \]
\[ (RC_C - RC_F) \pm 1.96\, SE(RC_C - RC_F) \tag{2.11} \]
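Formulas 2.8 and 2.10 can be sketched as a small helper. The estimates and standard errors passed in below are the values for "Low Back Pain" that appear in the worked calculation in Section 3.1.2; the function name is an assumption.

```python
from math import sqrt


def difference_ci(est_c, se_c, est_f, se_f, z=1.96):
    """Difference of two independent estimates with its SE and approximate 95% CI."""
    diff = est_c - est_f
    se = sqrt(se_c**2 + se_f**2)  # the variances of independent estimates add
    return diff, se, (diff - z * se, diff + z * se)


# RP of the comparison group (C) and the fusion group (F), with standard errors
d_rp, se_rp, ci_rp = difference_ci(0.0935, 0.1005080, -0.7166, 0.0812476)
print(f"dRP = {d_rp:.4f}, SE = {se_rp:.5f}, CI = ({ci_rp[0]:.4f}, {ci_rp[1]:.4f})")
```

The same helper applies to ΔRC by passing the RC values and their standard errors.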
2.2.3.5 ROC Curve
The systematic change can also be illustrated graphically by a relative operating characteristic
(ROC) curve. The ROC curve is obtained by plotting the pairs of cumulative proportional
marginal frequencies, with starting point (0, 0) and ending point (1, 1). The relationship between
the ROC curve and the RP value is $p_0 = \frac{1}{2} - A + B$, $p_1 = \frac{1}{2} - B + A$, so $RP = p_0 - p_1 = 2(B - A)$, if
A and B represent the areas between the curve and the diagonal (Figure 2.3).
Figure 2.3: An S-shaped ROC curve
Figure 2.4: The ROC Curves of the Fusion Group and the Comparison Group for the 1st item in BIS
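The link between the curve and RP can be checked numerically. The sketch below assumes that the cumulative proportions before treatment are plotted on the horizontal axis and those after treatment on the vertical axis, and that A is the area between curve and diagonal above the diagonal and B the area below it. Under these assumptions the net area A − B equals the area under the curve minus 1/2, so RP = 2(B − A) = 1 − 2 × (area under the curve). Function names are illustrative.

```python
from itertools import accumulate


def roc_points(x, y):
    """Pairs of cumulative proportions (before, after), from (0, 0) to (1, 1)."""
    n = sum(x)
    cum_x = [0] + list(accumulate(x))
    cum_y = [0] + list(accumulate(y))
    return [(cx / n, cy / n) for cx, cy in zip(cum_x, cum_y)]


def area_under_curve(points):
    """Trapezoidal area under the piecewise-linear curve through the points."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )


# Fusion group marginals for "Low Back Pain" (see Section 3.1.2)
points = roc_points([0, 0, 14, 18, 5], [13, 11, 8, 3, 2])
auc = area_under_curve(points)
rp_from_curve = 1 - 2 * auc
print(points)
print(f"RP from the curve = {rp_from_curve:.4f}")
```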
A curve above the diagonal indicates a systematic change towards lower categories while a
concave curve below the diagonal stands for a systematic change towards higher categories.
The greater the deviation, the stronger the systematic change between the two time points
[22-25]. Since a lower category stands for a better situation for the patients, Figure 2.4
shows that the fusion group performs far better in relieving back pain.
Thus, we have already reached the basic conclusion concerning the primary endpoint in the
study protocol: Fusion is superior to non-fusion for these patients.
2.2.4 Individual Change
2.2.4.1 Augmented Mean Ranks
In order to fit the two rankings (before and after) together as closely as possible without violating
the ordinary ranking, augmented mean ranks are calculated [24]. These are ranks related to pairs.
An observation in the ij:th cell of a contingency table has two augmented mean ranks, because the rank
can be calculated based on the occasion before the treatment or after. (Note: Here i and j
stand for the i:th and j:th category on the two occasions respectively, not for the rows and
columns of the table.) The notations $R^{(1)}_{ij}$ and $R^{(2)}_{ij}$ stand for the two augmented mean ranks, where
"1" stands for the occasion "Before the Treatment" and "2" for "After Treatment".
If the ranking is based on the before-treatment occasion, the observations in the ij:th cell have
ranks ranging from $\sum_{i_1=1}^{i-1} x_{i_1\cdot} + \sum_{j_1=1}^{j-1} x_{ij_1} + 1$ to $\sum_{i_1=1}^{i-1} x_{i_1\cdot} + \sum_{j_1=1}^{j} x_{ij_1}$. Then the augmented
mean rank can be obtained by:
\[ R^{(1)}_{ij} = \frac{\left(\sum_{i_1=1}^{i-1} x_{i_1\cdot} + \sum_{j_1=1}^{j-1} x_{ij_1} + 1\right) + \left(\sum_{i_1=1}^{i-1} x_{i_1\cdot} + \sum_{j_1=1}^{j} x_{ij_1}\right)}{2} = \sum_{i_1=1}^{i-1} x_{i_1\cdot} + \sum_{j_1=1}^{j-1} x_{ij_1} + \frac{1}{2}(1 + x_{ij}) \tag{2.12} \]
Similarly, $R^{(2)}_{ij}$ can be calculated by:
\[ R^{(2)}_{ij} = \sum_{j_1=1}^{j-1} x_{\cdot j_1} + \sum_{i_1=1}^{i-1} x_{i_1 j} + \frac{1}{2}(1 + x_{ij}) \tag{2.13} \]
$R^{(1)}_{ij}$ and $R^{(2)}_{ij}$ are only valid when $x_{ij} \ge 1$, where $x_{ij}$ is the frequency in the ij:th cell and
i, j = 1, 2, ..., m.
If $R^{(1)}_{ij} = R^{(2)}_{ij}$ for all ij, it is a rank transformable situation and all the differences between the
two occasions are caused by systematic changes. The greater the difference between these two
augmented mean ranks, the more individual changes are responsible for the differences before
and after the treatment [24].
2.2.4.2 Relative Rank Variance (RV) and Internal Rank Variance (IV)
The individual change is a random component of the disagreement between the two occasions.
It is described jointly by the relative rank variance (RV), the observable part, and the internal
rank variance (IV), the unobservable part.
RV is related to the difference between the augmented mean ranks and can be calculated by:
\[ RV = \frac{6}{n^3}\sum_{i=1}^{m}\sum_{j=1}^{m}\left(R^{(1)}_{ij} - R^{(2)}_{ij}\right)^2 x_{ij} \tag{2.14} \]
where i, j = 1, 2, ..., m.
The value of RV ranges from zero to one and will never reach one. If RV = 0, the disagreement
pattern between the two occasions is rank transformable, which means there are no observable individual
changes.
Due to the restricted number of ordered categories, there will be unobservable individual
variance [24]. IV is an empirical measure that captures this:
\[ IV = \frac{1}{n^3}\sum_{i=1}^{m}\sum_{j=1}^{m} x_{ij}\left(x_{ij}^2 - 1\right) \tag{2.15} \]
The value of IV also ranges from zero to one and will never reach one. The smaller the value of
IV is, the more information is obtained about the sources of difference between the two occasions
[24]. So for IV, the smaller the better.
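Formulas 2.12 to 2.15 can be sketched together. The first table below is the RTPC of the fusion group derived in Section 2.2.2, for which RV should be zero by construction. The second table moves one observation while keeping both marginal distributions, and is purely hypothetical. Function names and the `table[i][j]` layout (category i before, category j after, 0-based) are assumptions for this illustration.

```python
def augmented_mean_ranks(table):
    """Return (R1, R2) following formulas 2.12 and 2.13."""
    m = len(table)
    row_tot = [sum(r) for r in table]        # x_{i.}: marginal before
    col_tot = [sum(c) for c in zip(*table)]  # x_{.j}: marginal after
    r1 = [[0.0] * m for _ in range(m)]
    r2 = [[0.0] * m for _ in range(m)]
    for i in range(m):
        for j in range(m):
            half = 0.5 * (1 + table[i][j])
            r1[i][j] = sum(row_tot[:i]) + sum(table[i][:j]) + half
            r2[i][j] = sum(col_tot[:j]) + sum(table[k][j] for k in range(i)) + half
    return r1, r2


def rv_iv(table):
    """Relative rank variance (2.14) and internal rank variance (2.15)."""
    n = sum(map(sum, table))
    m = len(table)
    r1, r2 = augmented_mean_ranks(table)
    # empty cells carry weight x_ij = 0, so their (undefined) ranks drop out
    rv = 6 / n**3 * sum(
        (r1[i][j] - r2[i][j]) ** 2 * table[i][j]
        for i in range(m) for j in range(m)
    )
    iv = sum(c * (c * c - 1) for row in table for c in row) / n**3
    return rv, iv


rtpc_table = [
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [13, 1, 0, 0, 0],
    [0, 10, 8, 0, 0],
    [0, 0, 0, 3, 2],
]
rv_rtpc, iv_rtpc = rv_iv(rtpc_table)

perturbed = [row[:] for row in rtpc_table]   # hypothetical variation
perturbed[2][0], perturbed[2][1] = 12, 2
perturbed[3][0], perturbed[3][1] = 1, 9
rv_pert, _ = rv_iv(perturbed)
print(rv_rtpc, iv_rtpc, rv_pert)
```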
2.2.4.3 Augmented Correlation Coefficient, rα
A measure of the closeness of observations to the RTPC is defined as the augmented correlation
coefficient (rα) [27, 28]. It links RV and IV together and can be calculated as:
\[ r_\alpha = 1 - \frac{n^3\, RV}{(n^3 - n) - n^3\, IV} \tag{2.16} \]
rα belongs to the Pearson correlation family and ranges from -1 to 1. The higher the value,
the lower the individual variation from the RTPC [27]. rα = 1 means that all pairs of
augmented mean ranks are equal and that all the changes which happened after the treatment
can be explained by systematic changes [27].
A hypothesis test with H0 : rα = 1 versus H1 : rα < 1 can be conducted to test whether any
individual changes exist [27]. The test statistic is approximately normally distributed:
\[ z = (r_\alpha - 1)\sqrt{n - 1} \tag{2.17} \]
A p-value can then easily be obtained with statistical software. If the p-value is smaller than
the standard significance level 0.05, the null hypothesis is rejected and individual changes are
considered to exist.
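Formulas 2.16 and 2.17 can be sketched with the standard library alone. The value rα = 0.9 used below is hypothetical and not taken from the study's tables, and the p-value is computed as a lower-tail probability, which is an assumption based on the one-sided alternative H1 : rα < 1. Function names are illustrative.

```python
from math import erfc, sqrt


def augmented_correlation(rv, iv, n):
    """Augmented correlation coefficient, formula 2.16."""
    return 1 - n**3 * rv / ((n**3 - n) - n**3 * iv)


def individual_change_test(r_a, n):
    """z statistic of formula 2.17 and its lower-tail normal p-value."""
    z = (r_a - 1) * sqrt(n - 1)
    p_value = 0.5 * erfc(-z / sqrt(2))  # standard normal CDF evaluated at z
    return z, p_value


z, p_value = individual_change_test(0.9, 37)  # hypothetical r_a, n = 37
print(f"z = {z:.2f}, p = {p_value:.4f}")
```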
2.2.5 The Measure of Disorder (D) and Order Consistency (MA)
The measure of disorder (D) and the measure of order consistency (MA) are two measurements used for
comparing the consistency between two occasions. Here they will be used to examine whether BIS and
ODI are consistent or not.
Two different scales for the same variable can be illustrated by a contingency table [29] (e.g.
Figure 4.2). The pairs in the main diagonal direction are considered consistent. For instance,
the pair ("very severe" ODI, "rather severe" BIS) is consistent in ordering with ("fairly severe"
ODI, "moderate" BIS) [17]. On the contrary, the pairs in cells opposite to the main direction are
considered disordered.
The empirical measure of disorder (D) is defined as [17, 29, 30]:
\[ D = \frac{2\sum_{i=1}^{m_1}\sum_{j=1}^{m_2} x_{ij}\, x_{lr_{ij}}}{n(n-1) - t} \tag{2.18} \]
where
$t = \sum_{i=1}^{m_1}\sum_{j=1}^{m_2} x_{ij}(x_{ij} - 1)$;
$x_{ij}$: the ij:th cell frequency;
$x_{lr_{ij}}$: the lower-right region frequencies relative to the ij:th cell;
i, j: the categories of the two scales, $i = 1, 2, \ldots, m_1$, $j = 1, 2, \ldots, m_2$.
The measure of order consistency (MA) is the difference between the proportions of ordered and
disordered pairs. Hence:
\[ MA = (1 - D) - D = 1 - 2D \tag{2.19} \]
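Formulas 2.18 and 2.19 can be sketched as follows. Which corner of the printed table counts as "lower-right" depends on how the axes are drawn, so the sketch makes an explicit assumption: with `table[i][j]` holding the count for category i on the first scale and category j on the second, the disordered partners of a cell are those with a higher category on the first scale and a lower category on the second. The two small tables are hypothetical extremes chosen to show the range of the measures.

```python
def disorder_measures(table):
    """Return (D, MA) for a table of two ordinal scales (formulas 2.18, 2.19)."""
    n = sum(map(sum, table))
    t = sum(c * (c - 1) for row in table for c in row)  # pairs sharing a cell
    disordered = 0
    for i, row in enumerate(table):
        for j, count in enumerate(row):
            # partners ranked higher on scale 1 but lower on scale 2 (assumption)
            partners = sum(
                table[k][l]
                for k in range(i + 1, len(table))
                for l in range(j)
            )
            disordered += count * partners
    d = 2 * disordered / (n * (n - 1) - t)
    return d, 1 - 2 * d


consistent = [[2, 0], [0, 3]]  # hypothetical: the two scales always agree in order
reversed_ = [[0, 2], [3, 0]]   # hypothetical: the two scales always disagree
print(disorder_measures(consistent))
print(disorder_measures(reversed_))
```

The denominator is zero when all observations fall in a single cell, in which case the measure is undefined.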
Chapter 3
Spinal Fusion Versus Nonsurgical
Treatment
3.1 Detailed Analysis of the Primary Endpoint Variable: Low
Back Pain
3.1.1 The Percentage of Agreement
Based on the data in Figure 2.1 (a), PA of the fusion group can be obtained by $PA = \frac{0+0+2+2+1}{37} = 13.51\%$,
while for Figure 2.1 (b) PA = 51.35%. This indicates that there is more disagreement
in the fusion group for "Low Back Pain".
3.1.2 The Systematic Change
According to formulas 2.2, 2.3 and 2.7, the RP and RC values and the corresponding 95% confidence
intervals of the fusion group can be calculated by the following process:
C(X): 0, 0, 14, 32, 37;
C(Y): 13, 24, 32, 35, 37;
\[ p_0 = \frac{1}{37^2}(13\times 0 + 11\times 0 + 8\times 0 + 3\times 14 + 2\times 32) \approx 0.07742878 \]
\[ p_1 = \frac{1}{37^2}(0\times 0 + 0\times 13 + 14\times 24 + 18\times 32 + 5\times 35) \approx 0.794010226 \]
\[ RP = p_0 - p_1 \approx -0.7166 \]
\[ M = \min[0.07742878 - 0.07742878^2,\; 0.794010226 - 0.794010226^2] = 0.07143356 \]
\[ RC = \frac{1}{0.07143356\times 37^3}\times[\,11\times 0\times(37-0) - 0\times 13\times(37-24) + 8\times 0\times(37-14) - 14\times 24\times(37-32) + 3\times 14\times(37-32) - 18\times 32\times(37-35) + 2\times 32\times(37-37) - 5\times 35\times(37-37)\,] \approx -0.7246 \]
\[ \sigma^2_{jack}(RP) \approx 0.00697299 \]
\[ SE(RP) = \frac{37-1}{37}\times\sqrt{0.00697299} \approx 0.08124757 \]
Then the 95% confidence interval of RP can be easily obtained: (-0.8758, -0.5573). These
measures for the comparison group can be calculated through a similar process. The
summary of all these measures of systematic change for the two groups based on the main
variable is shown in Table 3.1.
Table 3.1: Summary of the Measures for Systematic Change Based on the Main Variable
The negative RP value for the fusion group indicates that after surgery a lot of patients shifted
their choices to lower categories. The close-to-zero value of RP for the comparison group
reveals that those who did not have the spinal fusion did not experience an obvious change or
even got worse.
The RC value of the fusion group is smaller than zero, which means the marginal distribution
of "Low Back Pain" after the fusion is less concentrated relative to the marginal distribution
before the fusion. This is reasonable because the patients concentrated on the choices "3.
moderate, 4. rather severe, 5. very severe" before the surgery while some shifted to "1. none, 2.
negligible" after (Figure 2.1 (a)). The RC value for the comparison group, in contrast, is very close to
zero and can be neglected. In fact, the 95% confidence intervals of RP and RC for the comparison
group both cover zero. So the systematic change of the first item in BIS for the comparison
group can be considered negligible.
It can be concluded that the treatment of the fusion group has some effect on the patients while
the treatment of the non-fusion group has not.
The 95 % confidence intervals of ∆RP and ∆RC for the main variable "Low Back Pain" can be
obtained as follows:
$$\Delta RP = 0.0935 - (-0.7166) = 0.8101 \quad \text{and} \quad \Delta RC = 0.0385 - (-0.7246) = 0.7631;$$
$$SE(\Delta RP) = \sqrt{0.0812476^2 + 0.1005080^2} \approx 0.12924;$$
$$SE(\Delta RC) = \sqrt{0.1023515^2 + 0.1127923^2} \approx 0.15231;$$
$$(RP_C - RP_F) \pm 1.96\, SE(RP_C - RP_F) = 0.8101 \pm 1.96 \times 0.12924 \approx (0.5568, 1.0634);$$
$$(RC_C - RC_F) \pm 1.96\, SE(RC_C - RC_F) = 0.7631 \pm 1.96 \times 0.15231 \approx (0.4646, 1.0616).$$
Neither of the 95 % confidence intervals covers zero, which means that there is a significant
difference in systematic change between the groups. The conclusion is that the fusion group
performed better on the variable "Low Back Pain".
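The between-group comparison can be sketched in the same way. The snippet below is an illustrative Python sketch under the assumption that the two groups are independent, so that the variances of the two estimates add; the helper name `difference_ci` is hypothetical, and the inputs are the rounded values quoted in the text.

```python
from math import sqrt


def difference_ci(est_c, est_f, se_c, se_f, z=1.96):
    """Difference (comparison minus fusion), its SE and a z-based interval.

    Assumes independent groups, so the squared standard errors are summed.
    """
    delta = est_c - est_f
    se = sqrt(se_c**2 + se_f**2)
    return delta, se, (delta - z * se, delta + z * se)


# RP: comparison 0.0935, fusion -0.7166 (values from the text).
d_rp, se_d_rp, ci_d_rp = difference_ci(0.0935, -0.7166, 0.1005080, 0.0812476)
# RC: comparison 0.0385, fusion -0.7246. Which standard error belongs to
# which group is assumed here; the order does not change the result.
d_rc, se_d_rc, ci_d_rc = difference_ci(0.0385, -0.7246, 0.1127923, 0.1023515)

for name, d, ci in (("RP", d_rp, ci_d_rp), ("RC", d_rc, ci_d_rc)):
    covers_zero = ci[0] <= 0.0 <= ci[1]
    print(f"delta {name} = {d:.4f}, CI = ({ci[0]:.4f}, {ci[1]:.4f}), "
          f"covers zero: {covers_zero}")
```

Both intervals lie entirely above zero, which is the basis for the statement that the systematic change differs significantly between the groups.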
3.1.3 The Individual Change
Based on formula 2.14-2.17, the value of the measures for individual changes can be obtained
and the results are shown in Table 3.2.
Table 3.2: Summary of the Measures for Individual Change Based on the Main Variable
The values of IV, the unobservable part of the individual change, are very close to zero for
both groups, so most of the individual change is observable. If only the RV values are considered,
the value for the surgical group is larger than that for the non-surgical group, indicating that
the fusion group has more individual variance. The p-value of rα is smaller than 0.05 for the
fusion group, but larger than 0.05 for the non-fusion group. With the help of rα, it is clear that
the individual change can be neglected for the comparison group on the item "Low Back Pain",
while fusion influences patients differently depending on their situation.
3.2 Analysis of all the items in BIS
3.2.1 Descriptive Analysis
The median values of the items, which stand for the average level of the variables, are shown
in Table 3.3. For the columns "First Survey", "Second Survey" and "b Question", the lower the
number, the better the situation. "Progress" is the difference between the values
of the same variable in the first and the second survey; for this measure, the higher the value,
the better. It is obvious that the fusion group had more positive changes than the comparison
group, since almost all the "Progress" values for the fusion group are larger than those of the
comparison group. The "b Question" concerns the change of the corresponding item within one
year after the operation. All the "b Question" values for the fusion group are smaller than
those for the other group, which also indicates that the patients in the fusion group experienced
more positive changes than those in the comparison group.
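As an illustration of how a median category and the "Progress" value can be read off ordinal frequency data, here is a minimal Python sketch. The function name `median_category` is hypothetical, and the input frequencies are the "Low Back Pain" marginals of the fusion group implied by the cumulative frequencies in chapter 3.1.2; the result is not copied from Table 3.3. For an even sample size the sketch returns the lower median, which is an assumption of this example.

```python
def median_category(freq):
    """Return the 1-based category that contains the median observation.

    freq: category frequencies, lowest category first.
    For an even number of observations the lower median is returned.
    """
    n = sum(freq)
    if n == 0:
        raise ValueError("no observations")
    target = (n + 1) // 2  # position of the median observation
    running = 0
    for category, count in enumerate(freq, start=1):
        running += count
        if running >= target:
            return category


first_survey = median_category([0, 0, 14, 18, 5])   # before treatment
second_survey = median_category([13, 11, 8, 3, 2])  # one year after
progress = first_survey - second_survey
print(first_survey, second_survey, progress)
```

A positive difference means that the median response moved to a lower, that is better, category.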
Table 3.3: The Average Performance on the Items of the BIS
3.2.2 The Percentage of Agreement
The percentage of agreement (PA) of the fusion group ranges from 8.11% to 37.84 %, while the
PA of the comparison group ranges from 36.11% to 70.27%. The lower PA of the fusion group
implies that there are more changes after the treatment for the fusion group. Improvements
are expected since the operations should be beneficial to the patients. So further studies of the
direction of the changes are desirable.
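The percentage of agreement is the share of patients who chose the same category on both occasions, that is, the diagonal of the square contingency table divided by the number of patients. The following Python sketch shows the computation on a small hypothetical 3×3 table; the table and the function name `percentage_agreement` are invented for illustration and are not data from this study.

```python
def percentage_agreement(table):
    """Percentage of pairs on the main diagonal of a square table.

    table[i][j] is the number of patients who chose category i before
    and category j after the treatment.
    """
    if any(len(row) != len(table) for row in table):
        raise ValueError("the contingency table must be square")
    n = sum(sum(row) for row in table)
    if n == 0:
        raise ValueError("the contingency table is empty")
    unchanged = sum(table[i][i] for i in range(len(table)))
    return 100.0 * unchanged / n


# Hypothetical toy table (20 patients, 12 of them unchanged).
toy_table = [
    [2, 1, 0],
    [3, 4, 1],
    [1, 2, 6],
]
pa = percentage_agreement(toy_table)
print(f"PA = {pa:.2f} %")
```

A low PA only says that many patients changed category; it does not say in which direction, which is why the systematic change is examined next.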
3.2.3 The Systematic Change
The systematic change can give us further explanation about whether the treatment really has
an effect on the patients or not. For the fusion group, all the ROC curves of the variables are
above the diagonal (Figure 3.1), which shows that the surgery had a positive effect on patients
for all items included in BIS. The same conclusion can be obtained from the fact that the RP
values of all questions for the fusion group are significantly smaller than zero. As for the RC
values, only four confidence intervals do not cover zero for the fusion group (Figure 3.2). The
corresponding items are "Limited Social Life", "Limitation in Living", "Perceived Concern"
and "Low Back Pain", the last of which was already shown in chapter 3.1.2. (VAS is not included
in the BIS.) To get a closer picture of these variables, the marginal distributions of each variable
are shown in Table 3.4. It can be seen that the responses are concentrated around the last two or
three choices before the fusion and shift to the first two or three choices after. So this does not
influence the conclusion that, in the fusion group, patients choose lower categories one year after
the surgery. Therefore, we will focus on the RP values when discussing the systematic change.
Table 3.4: The Marginal Distribution of the Four Variables for Both Situations in Fusion Group
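The ROC curves discussed above can be sketched numerically. The Python example below assumes, as the axes of Figure 3.1 suggest, that each curve plots the cumulative relative frequencies after treatment against those before treatment; the function name `roc_points` is hypothetical, and the input is the fusion group's "Low Back Pain" item from chapter 3.1.2.

```python
from itertools import accumulate


def roc_points(x, y):
    """Return the ROC points (C(X)_i / n, C(Y)_i / n), starting at (0, 0).

    x: category frequencies before treatment, lowest category first.
    y: category frequencies after treatment, in the same order.
    """
    n = sum(x)
    if n == 0 or sum(y) != n or len(x) != len(y):
        raise ValueError("x and y must describe the same non-empty sample")
    cx = [0] + list(accumulate(x))
    cy = [0] + list(accumulate(y))
    return [(a / n, b / n) for a, b in zip(cx, cy)]


points = roc_points([0, 0, 14, 18, 5], [13, 11, 8, 3, 2])
# A curve on or above the diagonal means the responses after treatment
# are shifted towards lower categories compared with before.
on_or_above_diagonal = all(after >= before for before, after in points)
print(points)
print("on or above the diagonal:", on_or_above_diagonal)
```

For this item every point lies on or above the diagonal, which matches the negative RP value found for the fusion group.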
The comparison group shows another pattern. In Figure 3.1, the ROC curve of the comparison
group is close to the diagonal for almost all items, some of them even below the diagonal line.
Hence, treatment with physiotherapy and cognitive therapy, but without the fusion, does not
change the condition of the patients. Sometimes even a negative effect is found.
[Figure 3.1 panels (horizontal axis: Before Treatment; vertical axis: After Treatment; both from
0.0 to 1.0; curves: Fusion, Comparison): Low Back Pain, Leg Pain, Limited Indoor Activities,
Limited Outdoor Activities, Physical Health, Limited Leisure Activities, Limited Social
Activities, Limited Social Life, Feeling Down, Feeling Impatience, Lack of Initiative,
Concentration Difficulties, Sleep Disturbance, Overall Mental Health, Limitation in Living,
Perceived Concern, Overall Quality of Life, VAS.Back, VAS.Leg]
Figure 3.1: The Comparison of ROC Curves of the Fusion Group and the Comparison Group
Figure 3.2: The Comparison of 95 % CI(RP and RC) for the Fusion Group and the Comparison
Group
Figure 3.3: The 95 % CI(∆RP and ∆RC) for Each Item on the BIS
The 95 % confidence intervals of ∆RP and ∆RC for all the items are shown in
Figure 3.3. (The exact values of ∆RP and ∆RC are displayed in appendix A.) For none of the
variables does the 95 % confidence interval of ∆RP cover zero, so the differences in
RP between the two groups are all statistically significant. Almost all the 95 % confidence
intervals of the ∆RC values do cover zero, which means that the groups have no significant
differences in the change of relative concentration. But as stated before, this does not affect
the final results. In summary, we can conclude once again that spinal fusion has a positive effect
on the patients who are suffering from low back pain.
3.2.4 The Individual Change
Though the fusion seems effective for treating low back pain, the influence may differ
among individuals. All the IV values, which measure the unobservable part of the individual
change before and after the treatment, are negligible, so the individual changes can be
considered observable. Large RV and small rα values indicate that significant individual
variance may exist. The normal distribution is used to test the null hypothesis rα = 1,
and the related p-values are shown in italics in Table 4 in the appendix. Besides "Low
Back Pain", which is discussed in chapter 3.1.3, the p-values of the items "Limited Indoor
Activities", "Physical Health", "Limited Social Activities", "Limited Social Life", "Sleep
Disturbance", "Limitation in Living" and "Overall Quality of Life" in the fusion group are also
smaller than 0.05, so the null hypothesis is rejected for these eight items and individual
variance is considered to exist in them. This reveals significant individual
differences which are not covered by the systematic change, so the spinal fusion has a varying
influence on the patients in these eight variables. Only two corresponding p-values in the
non-fusion group are smaller than 0.05, indicating that for the comparison group the changes are
mainly explained by the systematic change.
3.3 Results Related to VAS
The VAS scales are treated as ordinal data in this study. The PA value is very close to zero for
both groups. This is plausible because VAS has 100 categories, which lowers the probability
that an individual will choose the same category at two separate time points. Judging by the RP
values and the ROC curves for the two groups, the conclusion is that the fusion group
experienced a positive outcome. However, the small rα reveals that individual differences should be
considered when deciding whether a certain patient should have the fusion or not.
Detailed information about the PA, RP, RC, RV, IV and rα values of each item for the two groups
is shown in appendix A.
Chapter 4
Comparison of BIS and ODI
4.1 Comparison of the Primary Endpoint Variable
A similar analysis can be performed based on the Oswestry Low Back Pain Disability Index (ODI),
and it gives the same conclusion in this study. (Detailed results based on ODI, with the answers
treated as ordinal data, can be found in appendix B.) According to formulas 2.18 and 2.19,
the measure of disorder (D) and the order consistency (MA) are calculated for similar questions in
both outcome scales in order to detect whether they are consistent with each other.
Table 4.1: D and MA Between the Pain Extent Assessments on the BIS and ODI
The value of MA is very high between "BIS 1: Low Back Pain" and "ODI 1: Pain Intensity"
both before and after the fusion. Figure 4.1 gives the plots of the marginal distributions of the
first item for both questionnaires and situations (before and after) for the fusion group. Comparing
the two red lines in Figure 4.1 (a) and (b), the main reason for the disorder before the treatment
was that the BIS responses were more focused on "Rather Severe" while the ODI responses were more
focused on "Moderate". The contingency table shows that twelve patients chose "Rather
Severe" in BIS but "Moderate" in ODI (Figure 4.2 (a)). On the other hand, the after-fusion situation
shows high consistency. In Figure 4.2 (b), the non-zero numbers are very close to the lower-left
to upper-right line. Consequently, the blue lines in Figure 4.1 (a) and (b) have very similar
shapes, which is consistent with the fact that the value of MA after fusion is higher than that
before. What has been discussed above may also reveal that BIS is more sensitive on this question,
since its responses were focused on higher categories than those of ODI before the fusion but
shifted to the same concentration after.
[Figure 4.1 panels (vertical axis: Probability, 0 to 50; lines: Before Treatment, After Treatment):
(a) BIS 1: Low Back Pain, with categories None, Negligible, Moderate, Rather Severe, Very Severe;
(b) ODI 1: Pain Intensity, with categories No Pain, Mild, Moderate, Fairly Severe, Very Severe,
Unbearable]
Figure 4.1: The Marginal Distribution of the First Item of BIS and ODI (Fusion Group)
Figure 4.2: The Contingency Table of the Pairs of the First Item for BIS and ODI (Fusion
Group), panels (a) and (b)
The comparison group shows a similar situation when the consistency between "BIS
1: Low Back Pain" and "ODI 1: Pain Intensity" is compared. This indicates that BIS and ODI show
very similar results on the measure of low back pain.
By contrast, the value of MA between "BIS 2: Leg Pain" and "ODI 1: Pain Intensity" is
moderate, which reveals that these two questions do not give the same information. Thus, BIS
provided more information on chronic low back pain than the other scales.
4.2 Comparison of Activities, Social Life and Sleeping
The values of D and MA between the questions about activities on both questionnaires are listed
in Table 4.3. It seems that BIS and ODI give more divergent responses in this aspect, because
the consistency is lower than that in the pain extent dimension.
However, BIS is still considered better than ODI in this area, since the way it summarizes the
different activities is more reasonable and informative.
Social life and sleeping problems are the other two aspects which are included in both BIS and
ODI. The D and MA values are shown in Table 4.2. The consistency between the pairs related
to social life is not very high. BIS has two questions related to social life while ODI has only
one. However, readers may get confused, since the two BIS questions on social life have no
clear boundary. Therefore, ODI may give better results in this aspect.
Table 4.2: D and MA Between the Social Life and Sleeping Assessments on the BIS and ODI
The questions concerning sleeping problems in both questionnaires have a high consistency.
Since no big differences between the situations before and after the treatment without fusion
are found, only the marginal distributions of before and after fusion are plotted in Figure 4.3.
Table 4.3: D and MA Between the Activities Assessments on the BIS and ODI
Before the fusion, the responses of BIS were more focused on "Rather Severe" while ODI’s
responses were more concentrated on "Moderate". After the fusion, both of them mostly shifted
to "None". Therefore, BIS seems to be more sensitive to this question since it has reflected more
changes.
[Figure 4.3 panels (vertical axis: Probability, 0 to 50; lines: Before Treatment, After Treatment):
BIS 14: Sleep Disturbance, with categories None, Negligible, Moderate, Rather Severe, Very Severe;
ODI 7: Sleeping, with categories No Pain, Mild, Moderate, Fairly Severe, Very Severe, Unbearable]
Figure 4.3: The Marginal Distribution of the Sleeping Assessment of BIS and ODI (Fusion
Group)
4.3 Summary of the Comparison
Apart from what has been discussed above, BIS has more questions about mental health and the
overall situation, which are not covered by ODI. Hence, BIS contains more information than
ODI. The most important point is that BIS uses descriptive scales whereas ODI uses numerical
rating scales with a scoring instruction. As has been mentioned, scoring methods are not
appropriate for ordinal data.
Therefore, BIS is recommended over ODI for its descriptive scales and larger information
capacity.
Chapter 5
Summary
5.1 Discussion and Further Study
Since the actual cause of chronic low back pain still remains unknown, it is difficult to
provide a "perfect" prescription to cure this disease. Although spinal fusion is an option, the
fact that no technique available today can detect exactly which segment causes the pain makes
this option open to question.
Not many randomized studies have examined whether spinal fusion really eliminates low
back pain. Also, the statistical methods used vary among these randomized
studies. The current study is a supplement in this area, and a more appropriate method has
been applied in the statistical analysis. Svensson’s method is a rank-invariant non-parametric
method which involves no mathematical operations that are inappropriate for ordinal data.
Further, this method can separate the systematic and individual changes. The conclusion using
this method is that spinal fusion is to be recommended to the patients.
However, a lot of work concerning spinal fusion still remains to be done. First, which kind of
patients obtain more benefit from this surgery? Since the patients in the fusion group show
more individual variance in this study, this question can be researched further. Second,
since several methods of spinal fusion are available, which surgical method is the most suitable
for a specific kind of patient? Third, more follow-up time periods can be included. The
situation changes with the passage of time, and several follow-up periods will
provide more information about the changes. Fourth, simulation studies can be done to compare
Svensson’s method with other widely used methods in order to assess the stability and the
reliability of each method in such randomized studies.
5.2 Conclusion
The conclusion so far is that spinal fusion had a significant positive effect on patients with
chronic low back pain in this study. Also, the Balanced Inventory for Spinal Disorders (BIS) is
superior to the Oswestry Low Back Pain Disability Index (ODI) in randomized studies related to
spinal fusion.
Acknowledgements
Foremost, I would like to express my deepest gratitude and appreciation to my supervisor Prof.
Adam Taube for his excellent guidance, constructive suggestions, immense knowledge and
great patience. Besides, I would like to thank Dr. Bo Nyström for providing the fruitful idea
of this thesis and the data source, and for giving great support on the medical parts of this
thesis. Also, thanks to Birgitta Schillberg, the co-worker of Dr. Bo, for collecting the answers
from the patients and arranging the data in Excel. My sincere appreciation also goes to
Elisabeth Dahlqwist, a Master’s student in the Department of Statistics at Uppsala University,
for sharing her Bachelor thesis (co-author: Erika Åhlander) and for her help in reading the Swedish
study protocol. For assistance with R programming I want to thank Darlene LY. Dai, a
statistician working at the NCE CECR Prevention of Organ Failure (PROOF) Centre of Excellence,
Vancouver, Canada. I thank my friend Cheng Gong and the teacher from the English language
workshop, Jennifer Ast, for their advice on the language. Finally, my thanks go to my
boyfriend, who has never appeared in the past 25 years, which has enabled me to focus on my study
and research in statistics.
References
[1] Wiltfong R.E., Bono C.M., Charles Malveaux W.M.S. et al. Lumbar interbody fusion:
Review of history, complications, and outcome comparisons among methods. Current Or-
thopaedic Practice 2012;23(3):193-202.
[2] Edward N.H., Stephen M.D. Current Concepts Review: Lumbar Arthrodesis for the Treat-
ment of Back Pain. J Bone Joint Surg 1999;81-A(5):716-730.
[3] Deyo, R.A., Nachemson, A. Mirza, S.K. Spinal-Fusion Surgery - The Case for Restraint. N
Engl J Med 2004;350(7):722-726.
[4] Cherkin, D.C., Deyo, R.A., Loeser, J.D., et al. An international comparison of back surgery
rates. Spine 1994;19(11):1201-6.
[5] Katz, J.N. Lumbar spinal fusion: Surgical rates, costs, and complications. Spine 1995;20(24):78S-
83S.
[6] Gutknecht D. Low back pain FAQs. 1st ed. Hamilton, Ont.: BC Decker; 2007.
[7] Borenstein D., Calin A. Fast Facts : Low Back Pain (2nd Edition).Abingdon, Oxford, GBR:
Health Press Limited, 2012.
[8] Nyström B. Spinal fusion in the treatment of chronic low back pain: rationale for improve-
ment. The open orthopaedics journal. 2012;6:478.
[9] Deyo, R.A., Cherkin, D.C., Loeser, J.D. et al. Morbidity and mortality in association with
operations on the lumbar spine. The influence of age, diagnosis, and procedure. J Bone Joint
Surg - Ser. A 1992;74(4):536-543.
[10] Deyo, R.A., Ciol, M.A., Cherkin, D.C. et al. Lumbar spinal fusion: A cohort study of com-
plications, reoperations, and resource use in the Medicare population. Spine 1993;18(11):1463-
70.
[11] Nyström B. Ansökan Om Etikprövning 2007.
[12] Fairbank J, Frost H, Wilson-MacDonald J, Yu L-M, Barker K, Collins R. Randomised
controlled trial to compare surgical stabilisation of the lumbar spine with an intensive rehabil-
itation programme for patients with chronic low back pain: the MRC spine stabilisation trial.
35
Br Med J 2005; 330: 1233-9.
[13] Fritzell, P., Hägg, O. Nordwall, A. 2003, Complications in lumbar fusion surgery for
chronic low back pain: comparison of three surgical techniques used in a prospective random-
ized study. A report from the Swedish Lumbar Spine Study Group. Eur Spine J. 2003;12(2):178-
189.
[14] Fritzell P., Hägg O., Wessberg P., Nordwall A. Volvo award winner in clinical studies:
Lumbar fusion versus nonsurgical treatment for chronic low back pain. Spine 2001; 26(23):
2521-34.
[15] Brox J.I., Sörensen R., Friis A., et al. Randomized clinical trial of lumbar instrumented
fusion and cognitive intervention and exercises in patients with chronic low back pain and disc
degeneration. Spine 2003; 28(17):1913-21.
[16] Brox J.I., Reikerås O., Nygaard Ö., et al. Lumbar instrumented fusion compared with
cognitive intervention and exercises in patients with chronic back pain after previous surgery
for disc herniation: A prospective randomized controlled study. Pain 2006; 122: 145-155.
[17] Svensson E., Schillberg B., Kling A., et al. The balanced inventory for spinal disorders: the
validity of a disease specific questionnaire for evaluation of outcomes in patients with various
spinal disorders. Spine. 2009;34(18):1976-1983.
[18] Svensson E., Schillberg B., Kling A., et al. Reliability of the balanced inventory for spinal
disorders, a questionnaire for evaluation of outcomes in patients with various spinal disorders.
J Spinal Disord Tech. 2012;25(4):196-204.
[19] Svensson E. Guidelines to statistical evaluation of data from rating scales and question-
naires. J Rehab Med. 2001;33(1):47-48.
[20] Hand D.J. Statistics and the theory of measurement. J. Roy. Statist. Soc. Ser. A (Statistics
in Society). 1996;159(3):445-492.
[21] Svensson E., Starmark J., others. Evaluation of individual and group changes in social
outcome after aneurysmal subarachnoid haemorrhage: a long-term follow-up study. Journal of
Rehabilitation Medicine. 2002;34(6):251-259.
[22] Yang Y. Comparison of change between groups with data having rank-invariant properties
only [Dissertation]. Örebro University, 2009.
36
[23] Dahlqwist E., Åhlander E. Understanding elisabeth svensson?s method for paired ordered
categorical data and its applications on real life data [Dissertation]. Uppsala University 2012.
[24] Svensson E. Analysis of systematic and random differences between paired ordinal cate-
gorical data [dissertation]. Stockholm: Almqvist & Wiksell International; 1993.
[25] Svensson E. Ordinal invariant measures for individual and group changes in ordered cate-
gorical data. Stat Med. 1998;17(24):2923-36.
[26] Svensson E., Starmark J.E. Separation of systematic and randm differences in ordinal
rating scales. Stat Med. 1994;13(23-24):2437-53.
[27] Yang Y. Augmented rank-order correlation coefficient [Working Paper]. Örebro Univer-
sity, 2012.
[28] Zhao X., Zhu Y. Rank-Based Statistical Methods for Paired Ordinal Data [Dissertation].
Örebro University, 2009.
[29] Svensson E. Concordance between ratings using different scales for the same variable.
Stat Med. 2000;19(24):3483-96.
[30] Svensson E. Comparison of the quality of assessments using continuous and discrete or-
dinal rating scales. Biometrical J. 2000;42(4):417–434.
37
Appendix A: Detailed Results About Spinal Fusion vs. Non-surgical Treatment Based on BIS
Table A.1: The Difference of RP and RC Values for the Two Groups on the Items of the BIS
Table A.2: Measure of Agreement and Disagreement in Paired Assessments on the Items of the BIS
Appendix B: Detailed Results About Spinal Fusion vs. Non-surgical Treatment Based on ODI
[Figure B.1 panels (horizontal axis: Before Treatment; vertical axis: After Treatment; both from
0.0 to 1.0; curves: Fusion, Comparison): Pain Intensity, Personal Care, Lifting, Walking, Sitting,
Standing, Sleeping, Sex Life, Social Life, Travelling, Summary]
Figure B.1: The Comparison of ROC Curves of the Fusion Group and the Comparison Group (ODI)
Appendix C: The Balanced Inventory for Spinal Disorders
Appendix D: Oswestry Low Back Pain Disability Index