Estimating Companies’ Survival in Financial Crisis
Using the Cox Proportional Hazards Model
By: Niklas Andersson
Independent Thesis Advanced Level
Department of Statistics
Supervisor: Inger Persson
2014
Abstract
This master thesis is aimed at answering the question What is the contribution from a company’s sector with regard to its survival of a financial crisis?, with the sub-question Can we use survival analysis on financial data to answer this?. Survival analysis, which is seldom applied to financial data, is therefore used to answer the main question. This is interesting because it examines how well survival analysis works on financial data while also evaluating whether all companies experience a financial crisis in the same way. The dataset consists of all companies traded on the Swedish stock market during 2008. The results show that the survival method is well suited to the data used. The sector a company operates in has a significant effect, but the power is too low to give any indication of specific differences between the sectors. It is also found that the group of smallest companies had much better survival than larger companies.
Keywords: Survival Analysis, Survival Data, Time to Event, 2008 Financial Crisis
and Swedish Stock Market.
Acknowledgments
First and foremost I would like to thank my family for their continuous support throughout my academic studies. Secondly I would like to thank my supervisor, Inger Persson, for her support, input and dedication.
Contents
1 Introduction  1
  1.1 Background  1
  1.2 Method  3
2 Theory  5
  2.1 Survival Analysis  5
    2.1.1 Survival Function  6
    2.1.2 Hazard Function  7
    2.1.3 Censoring  8
    2.1.4 Estimations of Survival and Hazard  10
    2.1.5 Test of Group Differences  13
    2.1.6 The Cox Proportional Hazards Model  17
    2.1.7 Assumptions, Goodness of Fit and Diagnostics  25
  2.2 Key Performance Indicators  33
    2.2.1 Liquidity  33
    2.2.2 Solvency  35
3 Data  36
  3.1 The Observations  36
  3.2 The Variables  36
  3.3 An Event  38
4 Results  40
  4.1 Survival Functions and Cumulative Hazards  40
    4.1.1 By Sectors  43
    4.1.2 By Size  46
  4.2 Cox Modeling  48
    4.2.1 Individual Study Start  48
    4.2.2 Day One Study Start  50
    4.2.3 Final Model  60
5 Summary and Conclusion  63
6 Bibliography  65
7 Appendix  1
  7.1 Descriptive Statistics for Strata  1
  7.2 Cumulative Hazards  3
  7.3 Companies  5
  7.4 Example Data  7
1 Introduction
Below follows a short background to this thesis outlining the problem and environment
that we will work with. This is followed by a short presentation of the method that will
be used and the main question that the thesis will strive towards answering.
1.1 Background
Stock markets have always been susceptible to financial crises and this in turn affects each company with stock traded on the market. Each company will do its best to survive a financial crisis and keep investors believing in it. By extension, the company is trying to keep the price of its stock from falling too far.
The reasons for financial crises have been many throughout history and, although not all financial crises are rooted in a stock market, many have come to affect these markets. Kindleberger and Aliber (2005, ch. 1) report on the first recorded financial crisis, the ”Tulipmania” that originated in overinflated prices of tulip bulbs. Kindleberger and Aliber (2005, ch. 1) also tell of more recent financial crises like the 1920s stock price bubble or, closer to home, the real estate and stock crisis during the 80s and the beginning of the 90s that affected Sweden, Finland and Norway amongst other nations.
In times like these we might see stock prices falling rapidly and far, with huge consequences for companies and financial institutions. But we do not expect every stock to experience the same price fall; after all, no two companies are the same. Therefore their reaction, and the degree to which they are affected by a financial crisis, differ from company to company. We also expect companies and the prices of their stocks to react differently from crisis to crisis since, in the same way as no company is the same as another, there are always parameters that make each crisis unique.
During 2008 the world’s stock markets went through a recession leading to large falls in stock prices. Sweden was no exception and the general price index of the OMX
Stockholm exchange closed at a mere 50.2% of its level at the beginning of the year. The recession affected all stocks traded on the Stockholm market and no stock that was traded at the beginning of the year closed on a positive note by the end of December.
However, as we have stated, stocks are not all the same and there ought to be differences between them. One could speculate that the size of the company and the sector in which it operates could make a difference in the stock’s sensitivity to a recession like the one in 2008. There were most likely also other factors, such as Key Performance Indicators (KPI), specific to each company that affected the extent to which their stock price fell. After all, it is the traders that set the price, and their confidence and belief in a company’s stock determines what price they are willing to pay. Figure: (1.1) below shows that there is some difference in how the index price for different sectors developed during the time of interest in this thesis.
Figure 1.1: Sector indexes during 2008 (base value 2008-01-01)
As of now we do not know if these differences are significant, that is, if the sectors were affected differently during this time period. The primary interest of this thesis will be to determine if the different sectors played a role in how well a stock resisted price falls during 2008.
1.2 Method
What we want to find out in this thesis is whether there were differences in each company’s stock’s resistance to price falls during the 2008 recession, specifically whether there were differences due to the different sectors/industries that the companies operated in. In order to do this we have chosen to use the statistical methods of survival analysis and Cox proportional hazards modeling.
We will be working with survival time data, or Time to Event data, where the dependent variable is the time from when we start observing something until a specific event occurs. When working with this kind of data the Cox proportional hazards model is by far the most popular method. In fact, according to Allison (2010, ch. 5), the original paper written by Sir David Cox in 1972, in which he presents the proportional hazards method, had been cited over 1000 times in 2009, making it the most cited paper in statistics and earning it a place among the top 100 most cited papers in science. There are many reasons for this popularity, and the foremost is probably the fact that the model itself does not need any information about an underlying distribution that we expect the survival time to follow. This makes the Cox method semi-parametric and sets it apart from other, parametric, models where you need to make a decision about the underlying distribution. This makes the Cox proportional hazards model more robust. We will see how this semi-parametric model works in Section: (2.1.6). There are also other reasons for the method’s popularity, amongst them the fact that it is relatively easy to include time-dependent covariates and that it can use both discrete and continuous measurements of the Time to Event. That said, the Cox method is not a universal method and there are times when a parametric model based on known distributions is preferred. Also, using a semi-parametric method will result in higher standard errors when estimating parameters, as we will see in Section: (2.1.4). (Allison, 2010, ch. 5)
In this thesis the Cox method will be used since the underlying hazard is unknown in our data, and the dependent variable will be a Time to Event variable since we are interested in effects that make a company survive longer in a financial crisis.
In the research leading up to this paper I have found few cases of Time to Event data, and thus Cox proportional hazards modeling, being used on financial data. The lack of previous research suggests that the method is not commonly used in this field. However, Ni (2009) has successfully used it in a paper focused on the effect of a number of share-related (company stock) KPI:s. She defined a certain price fall in the companies’ stocks (the same for all stocks) to be the event of interest (essential in survival analysis) in order to find each company’s survival time. In essence the method she used is the same as the one used in this paper; here, however, the focus is shifted towards the contribution of the sector in which a company operates.
With the given background, this thesis will strive to answer the following question: What is the contribution from a company’s sector with regard to its survival of a financial crisis?, with the sub-question Can we use survival analysis on financial data to answer this?
As already mentioned, not all crises are the same; this thesis will therefore be limited to the crisis and the following price fall that took place during 2008. In the same way, not all markets are the same, and a second limitation is that only stocks traded on the OMX Stockholm exchange will be included in the study. The dataset will initially consist of stocks traded at the beginning of 2008 and they will be studied throughout the entire year.
2 Theory
In this section we will look closer at the theories that will be used as a foundation in
this thesis. We will cover theory regarding survival analysis in general and the Cox
Proportional Hazards model as well as looking at financial theory regarding KPI:s that
can influence the investors’ decisions.
2.1 Survival Analysis
In survival analysis we are mainly concerned with studying time to event data. That
is, we have a specific event that either will or will not happen for the observations of
our study. This kind of data can occur in a wide number of fields such as medicine,
engineering and economics. A simple example of an event can be the death of a patient
or a specific diagnosis within the field of medicine while for an engineer it could mean the
breaking of a component in a machine. It is the time to this specific event, the survival time, that we study for each observation. Of course the event does not have to be something negative; it could just as well be a remission after treatment. (Klein and Moeschberger, 2005, ch. 2.1)
Throughout this theory section we will give examples of graphical and numerical results by applying the theory to a dataset created for this purpose. The event in this dataset is, as it will be when we look at our main dataset, negative and concerns the time it takes for a mechanical component to break. The dataset and its variables are presented in depth in Appendix: (7.4).
2.1.1 Survival Function
We will let the time until our event of interest occurs be T; we can then characterize the distribution of T by using the Survival Function. This function tells us the probability of an individual (observation) surviving beyond a specific time, t, or equivalently the probability of experiencing the event after time t (Klein and Moeschberger, 2005, ch. 2.2). We can define the survival function as:
S(t) = Pr(T > t) (2.1)
The reason for the survival function being Pr(T > t) rather than Pr(T = t) is that some observations will not experience the event during our study and thus their time to event is unknown. These observations are called censored observations, and in Section: (2.1.3) we will look closer at why and how censoring occurs.
The survival function itself is the complement of the cumulative distribution function, S(t) = 1 - F(t), or equivalently the integral of the probability density function, S(t) = \int_t^{\infty} f(u)\,du. If T is continuous then S(t) is also continuous and a strictly decreasing function. At time 0 the survival probability is always 1, and as time goes towards infinity the probability of survival goes towards 0. (Klein and Moeschberger, 2005, ch. 2.2)
Klein and Moeschberger (2005, ch. 2.2) also describe what happens when T is not continuous. This is often the case in survival analysis due to lack of precision in the measurement of time. A medical study might only check for the event at check-ups performed at some time interval, such as once a year. In that case the researchers are only able to determine that the event has happened between the last check-up and the current one. We then have to tweak the definition of our survival function. As we will see in Section: (3) this is the case in our study, since we can only observe the event once a day. Assume that the time to event is discrete and can take on the values t_i, i = 1, 2, 3, \ldots, then:

p(t_i) = \Pr(T = t_i), \quad i = 1, 2, 3, \ldots \quad \text{where } t_1 < t_2 < t_3 < \cdots

which gives

S(t) = \Pr(T > t) = \sum_{t_i > t} p(t_i)    (2.2)
2.1.2 Hazard Function
Another way of looking at the survival time is through the Hazard Function or Hazard Rate, h(t). This function gives the probability of experiencing the event in the next instant, conditioned on the event not having happened up to that point in time. It thereby describes the risk of experiencing the event over time and how that risk changes with time. The hazard rate is defined as (see e.g. Klein and Moeschberger (2005, ch. 2.3)):

h(t) = \lim_{\Delta t \to 0} \frac{\Pr[t \le T < t + \Delta t \mid T \ge t]}{\Delta t}    (2.3)

and if T is continuous then,

h(t) = \frac{f(t)}{S(t)} = -\frac{\partial \ln[S(t)]}{\partial t}    (2.4)
Cumulative Hazard Function
Closely related to the hazard function is the Cumulative Hazard Function which, as the name suggests, accumulates the hazard rates over time. This definition is also given by Klein and Moeschberger (2005, ch. 2.3) for a continuous T:

H(t) = \int_0^t h(u)\,du = -\ln[S(t)]    (2.5)

where

S(t) = \exp[-H(t)] = \exp\left[-\int_0^t h(u)\,du\right]    (2.6)

If T is not continuous, the cumulative hazard function is instead

H(t) = \sum_{t_i \le t} h(t_i)    (2.7)
2.1.3 Censoring
When working with survival data, one issue almost always has to be considered: censoring. Kleinbaum and Klein (2012, ch. 1.2) describe that censoring occurs when we have some information about an observation’s survival time without knowing the exact
time. A simple example is when a patient in a study is no longer followed while the event
of interest has not yet happened. In this case we know that the patient ”survived” up
until the point where we stopped following that patient but not for how long afterwards.
Kleinbaum and Klein (2012, ch. 1.2) give three general reasons for censoring.
• The study ends without the event occurring for an observation
• The observation is lost to follow-up
• When working with people they can choose to withdraw from the study
At the end of a study all observations will either have experienced the event or have been censored. However, not all cases of censoring are the same, and it is common to divide censoring into two major categories, right and left censoring, determined by how the data is collected.
Right Censoring
Right censoring is when the event is observed only if it occurs before a predetermined
time. This is the simplest form of right censoring and is often called Type I censoring.
An example of this is when a study starts off with a number of patients in whom we await an event. Due to cost, however, the study might end before all individuals have experienced the event, and those individuals will then be right censored. In this simplest form, all individuals that are censored when the study ends will have been observed for the same length of time (from the start to the end of the study) and have the same censoring time.
(Klein and Moeschberger, 2005, ch. 3.2)
Similar to Type I censoring is generalized Type I censoring which, as in the basic Type I case, has a predetermined end date of the study. In this case, though, the starting dates of the individuals are not the same; rather, they enter the study at individual times. As a consequence each observation will have a different fixed censoring time and
when the study ends those observations that are time censored will not have the same
study time. (Klein and Moeschberger, 2005, ch. 3.2)
Another common type of right censoring, often used within the field of technology according to Klein and Moeschberger (2005, ch. 3.2), is Type II censoring. In
this type of censoring a number of observations (n) enter the study at the same time, but instead of a predetermined time at which the study ends, a number of events (r) is chosen. We have the condition that r < n, and when r of the observations have experienced the event the study ends. One property of this kind of censoring is that the data will consist of the r smallest event times from a random sample of size n.
A pseudo case of right censoring that is also common in survival analysis, and which can coexist with many other types of censoring, is Random Censoring. Random censoring occurs when we, for some reason, no longer can study our observation/individual although we know that it has not yet experienced the event. Random censoring implies that the censoring time and the event time are independent. This might come about due to a patient in a medical study moving outside the area in which the hospital performing the study operates, and thus the patient is lost to further follow-up, which in turn leads to the event never being observed although the study itself is still ongoing. These cases are censored at the time when they are lost and not, if for example Type I censoring is used, at the end of the study. (Klein and Moeschberger, 2005, ch. 3.2)
In our study random censoring will be present. This might be due to a number of things such as mergers, buy-outs, or termination of trade. Other types of censoring exist, such as Left Censoring and Truncation, but these will not be present in this thesis. For information about other types of censoring see e.g. Klein and Moeschberger (2005, ch. 3) or Kleinbaum and Klein (2012, ch. 1).
2.1.4 Estimations of Survival and Hazard
In the next two sections we will look closer at estimation of the survival function and the hazard. There are two different methods that we will cover, both applicable when working with right censored data. The focus on right censored data is due to the fact that this is the kind of censoring present in our dataset. The dataset itself is outlined in detail in Section: (3). The two methods that we will look at are the Kaplan-Meier (K-M) or Product-Limit estimator and the Nelson-Aalen (N-A) estimator. As we will see, both these methods can be used to derive the survival function and the cumulative hazard function.
When presenting and working with the methods we will use the following notation and assumptions. We have a total of n individuals or observations, and each of them holds information about its time to event t_i as well as whether it experienced the event or was censored. The data will be discrete, thus allowing for possible ties, and the event/censoring can only be observed at specific time points, so that t_1 < t_2 < t_3 < \cdots < t_D. Here t_1 represents the first time at which we can observe an event, t_i the ith time point and t_D the time point at which the last event is observed. At a given time point we know the number of individuals at risk of experiencing the event, Y_i, and we observe d_i events. If d_i > 1 we have a tie at time point t_i. The ratio d_i/Y_i can thus be interpreted as the conditional probability of experiencing the event at time t_i given that it has not been experienced at any previous time. This quantity will be used in both methods in order to estimate the survival function S(t) and the cumulative hazard function H(t). These notations are equivalent to those used by Klein and Moeschberger (2005) in their presentation of these estimators, with the assumption that the censoring time is unrelated to the event time (censoring carries no information about the time to the event for a censored observation).
Kaplan–Meier
The Product-Limit estimator of the survival function was introduced by Kaplan and Meier (1958) and is therefore also called the Kaplan-Meier estimator. This estimator has a limitation that we need to be aware of: it is only well defined for the time interval observed in the data, t_1 to t_D. In other words, beyond the largest observed time the estimates are not reliable (Klein and Moeschberger, 2005, ch. 4.2). The K-M estimator is defined as (see e.g. Klein and Moeschberger (2005, ch. 4.2)):

\hat{S}(t) = \prod_{t_i \le t} \left[1 - \frac{d_i}{Y_i}\right], \quad \text{for } t_1 \le t    (2.8)

If the condition t_1 \le t is not met then \hat{S}(t) = 1, that is, the probability of survival before the first observed event time is 1. The variance of these estimates can be derived through Greenwood’s formula (see e.g. Klein and Moeschberger (2005, ch. 4.2)):

\hat{V}[\hat{S}(t)] = \hat{S}(t)^2 \sum_{t_i \le t} \frac{d_i}{Y_i(Y_i - d_i)}    (2.9)

Earlier we pointed out the connection between the survival function and the hazard rate, through which we can use the K-M estimator to estimate the cumulative hazard function as well as the survival function. By taking -\ln of the estimated survival function at each time point we acquire an estimate of the cumulative hazard for that time point, -\ln[\hat{S}(t)] = \hat{H}(t).
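The thesis relies on SAS for all estimation; purely as an illustration, the following is a minimal Python/NumPy sketch of the Product-Limit estimate (2.8) together with Greenwood's variance (2.9). The array names `times` and `events` are illustrative, and the sketch assumes d_i < Y_i at every event time.

```python
import numpy as np

def kaplan_meier(times, events):
    """Kaplan-Meier estimate of S(t) and Greenwood variance at the event times.

    times  : observed times (event or censoring)
    events : 1 if the event was observed, 0 if the observation is censored
    """
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=int)
    event_times = np.unique(times[events == 1])       # t_1 < t_2 < ... < t_D
    surv, var = [], []
    s, greenwood_sum = 1.0, 0.0
    for t_i in event_times:
        y_i = np.sum(times >= t_i)                    # number at risk, Y_i
        d_i = np.sum((times == t_i) & (events == 1))  # number of events, d_i
        s *= 1.0 - d_i / y_i                          # product-limit step (2.8)
        greenwood_sum += d_i / (y_i * (y_i - d_i))    # Greenwood accumulator
        surv.append(s)
        var.append(s**2 * greenwood_sum)              # (2.9)
    return event_times, np.array(surv), np.array(var)
```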
Example: In Figure: (2.1) the survival function for the example data is estimated using the Kaplan-Meier estimator. The time to event data is discrete and the survival function has a step-shaped, or stair-like, downward slope. The survival function is, as discussed in Section: (2.1.1), decreasing. The last observation (in time) is censored, thus the survival function ends in a horizontal line.
Figure 2.1: Estimated survival function on the example data
Nelson–Aalen
Although the K-M estimator can be used to estimate the cumulative hazard, the Nelson-Aalen estimator of the cumulative hazard is considered to have better small-sample properties and will therefore be used in this thesis (Klein and Moeschberger, 2005, ch. 4.2). The estimator was developed by Nelson (1972) and Aalen (1978), and Aalen in the same publication gave the expression for the variance of the estimator, given in (2.11). The estimator itself is given below in (2.10) as presented by Klein and Moeschberger (2005, ch. 4.2).

\tilde{H}(t) = \sum_{t_i \le t} \frac{d_i}{Y_i}, \quad \text{for } t_1 \le t    (2.10)

If the condition t_1 \le t is not met then \tilde{H}(t) = 0, that is, the cumulative hazard before the first observed event time is 0.

\sigma^2_H(t) = \sum_{t_i \le t} \frac{d_i}{Y_i^2}    (2.11)

As with the K-M estimator, this estimator is well defined up to the largest observation of t. Just as we could transform \hat{S}(t) into an estimator of H(t) through -\ln[\hat{S}(t)] = \hat{H}(t), we can transform the N-A estimator into an alternative estimator of the survival function, \tilde{S}(t) = \exp[-\tilde{H}(t)]. (Klein and Moeschberger, 2005, ch. 4.2)
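Again as an illustration only (the thesis itself uses SAS), a minimal NumPy sketch of (2.10) and (2.11); `times` and `events` are the same illustrative arrays as before.

```python
import numpy as np

def nelson_aalen(times, events):
    """Nelson-Aalen estimate of H(t) and its variance at the event times."""
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=int)
    event_times = np.unique(times[events == 1])
    H, var_H = [], []
    h_cum, v_cum = 0.0, 0.0
    for t_i in event_times:
        y_i = np.sum(times >= t_i)                    # at risk, Y_i
        d_i = np.sum((times == t_i) & (events == 1))  # events, d_i
        h_cum += d_i / y_i                            # (2.10)
        v_cum += d_i / y_i**2                         # (2.11)
        H.append(h_cum)
        var_H.append(v_cum)
    # alternative survival estimate: S(t) = exp(-H(t))
    return event_times, np.array(H), np.array(var_H)
```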
Example: Below, in Figure: (2.2), the cumulative hazard is estimated on the example data using the Nelson-Aalen estimator. The cumulative hazard is increasing since it is the accumulation of the estimated hazard at each point in time. The last observation (in time) is censored, thus the cumulative hazard ends in a horizontal line in the same way as the survival function in Figure: (2.1).
Figure 2.2: Estimated cumulative hazards on the example data
2.1.5 Test of Group Differences
When we are studying a variable that can be divided into two or more groups we might be interested in testing whether there is a difference between these groups regarding their survival/hazard. An example of this kind of study, where differing survival between groups might be of interest, is in the field of medicine where you can have three different groups of patients all suffering from the same disease but treated with different medicines. The primary interest of that study might then be to determine if there is a difference in survival between the groups and, by extension, possibly also between the medicines. In the same way an engineer might want to test whether there is a difference in how long a component lasts before breaking in a machine depending on which manufacturer made that component. In order to keep the notation equivalent to that of Klein and Moeschberger (2005) we will let K represent the number of groups, where Kj denotes the jth group. Also, τ will represent the last point in time at which there is at least one observation still at risk in every group.
The test itself focuses on the hazard at each time point where there is an observed event and determines whether there is a difference between the groups at that time point. A weakness with these kinds of multigroup tests, pointed out by Klein and Moeschberger (2005, ch. 7.3), is that they test whether at least one population differs from the others at any
point in time. In other words, while the test can detect a significant difference between the groups, it only tells us that some group differs from the rest, not which one. The hypothesis is:
H_0 : h_1(t) = h_2(t) = h_3(t) = \cdots = h_K(t), \quad \text{for all } t \le \tau
H_1 : \text{at least one of the } h_j(t) \text{ differs for one, or more, } t \le \tau    (2.12)
with the test function
Z_j(\tau) = \sum_{i=1}^{D} W_j(t_i)\left[\frac{d_{ij}}{Y_{ij}} - \frac{d_i}{Y_i}\right], \quad j = 1, \ldots, K.    (2.13)
This test function will be a cornerstone in creating the test statistic. In (2.13), W_j(t_i) represents a weight that could be different for each group. However, this is not the case in the most common variations of this test, some of which we will cover below, and thus (2.13) can be simplified. The tests that will concern us use a weighting function of the form W_j(t_i) = W(t_i)Y_{ij}, where W(t_i) is the same for all groups and Y_{ij} is the number at risk in the jth group at time t_i. Using this we can simplify (2.13) to (2.14). (Klein and Moeschberger, 2005, ch. 7.3)
Z_j(\tau) = \sum_{i=1}^{D} W(t_i)\left[d_{ij} - Y_{ij}\left(\frac{d_i}{Y_i}\right)\right], \quad j = 1, \ldots, K.    (2.14)
Further, the variances and covariances of Z_j(\tau) from (2.14) are given by:
\hat{\sigma}_{jj} = \sum_{i=1}^{D} W(t_i)^2\, \frac{Y_{ij}}{Y_i}\left(1 - \frac{Y_{ij}}{Y_i}\right)\left(\frac{Y_i - d_i}{Y_i - 1}\right) d_i, \quad j = 1, \ldots, K    (2.15)
and
\hat{\sigma}_{jg} = -\sum_{i=1}^{D} W(t_i)^2\, \frac{Y_{ij}}{Y_i}\, \frac{Y_{ig}}{Y_i}\left(\frac{Y_i - d_i}{Y_i - 1}\right) d_i, \quad g \ne j    (2.16)
In both 2.15 and 2.16 we find the term (Y_i - d_i)/(Y_i - 1), which equals one in all cases except when two observations have the same time to event; this term thus corrects for possible ties. The Z_j(\tau) are linearly dependent, and the test statistic that we will use is produced by excluding any one of the Z_j's. From this choice we get the estimated (K - 1) \times (K - 1) variance-covariance matrix \hat{\Sigma}. This matrix will then be used
in the test statistic (2.17). The test statistic is of quadratic form and, when the null hypothesis is true and we have a large sample, it is chi-square distributed with K - 1 degrees of freedom. That means that when using significance level α to test H_0, the test rejects when χ² is larger than the upper αth percentile of a χ²_{K-1} distribution. (Klein and Moeschberger, 2005, ch. 7.3)
\chi^2 = (Z_1(\tau), \ldots, Z_{K-1}(\tau))\, \hat{\Sigma}^{-1}\, (Z_1(\tau), \ldots, Z_{K-1}(\tau))^t    (2.17)
In the special case where K = 2 the test statistic in (2.17) simplifies to:
Z = \frac{\sum_{i=1}^{D} W(t_i)\left[d_{i1} - Y_{i1}\left(\frac{d_i}{Y_i}\right)\right]}{\sqrt{\sum_{i=1}^{D} W(t_i)^2\, \frac{Y_{i1}}{Y_i}\left(1 - \frac{Y_{i1}}{Y_i}\right)\left(\frac{Y_i - d_i}{Y_i - 1}\right) d_i}} = \frac{Z_1(\tau)}{\hat{\sigma}_1}    (2.18)

where Z is standard normally distributed.
There is a multitude of possible variations on this test due to the weight function, and altering this weight function gives the test different properties. The arguably simplest weight is 1, where all time points have the same importance; this case is commonly known as the Log-Rank test. Below follows a list of some variations on the test presented by Klein and Moeschberger (2005, ch. 7.3) and Kleinbaum and Klein (2012, ch. 2.5-2.6).
Log-Rank test
W (t) = 1: All time points have the same weight which makes it ideal when the hazard
rates in all groups are proportional.
Gehan’s or Wilcoxon Test
W(t_i) = Y_i: More weight is placed on time points where more individuals are at risk; that is, the test is weighted towards earlier failures. This test is best if we have reason to suspect that the ”treatment” effect is strongest early on. Focus is on differences early in the study.
Tarone-Ware test
W(t_i) = f(Y_i) where f(y) = \sqrt{y}: This test puts the most weight on differences between the observed and expected events at time points where there is the most data.
In Section: (4.1) both the Log-Rank and Wilcoxon tests will be presented and examined when determining significant differences between sectors and company sizes. This is in order to find general differences throughout time with the unweighted Log-Rank test, while possible differences early on can be detected using the Wilcoxon test. In our study we will have the most data early on, so the Tarone-Ware test would also put emphasis on early differences; it will therefore not be used in favor of the Wilcoxon test.
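As an illustration of (2.18) only (the thesis performs these tests in SAS), the sketch below computes the two-sample statistic by hand for the Log-Rank weight W(t_i) = 1 and the Gehan/Wilcoxon weight W(t_i) = Y_i. The array names are illustrative, and `group` is assumed to be coded 0/1.

```python
import numpy as np
from scipy.stats import norm

def two_group_test(times, events, group, weight="logrank"):
    """Two-sample test of Equation (2.18); group 1 plays the role of j = 1."""
    times, events, group = map(np.asarray, (times, events, group))
    event_times = np.unique(times[events == 1])
    num, var = 0.0, 0.0
    for t_i in event_times:
        at_risk = times >= t_i
        y_i  = at_risk.sum()                              # Y_i
        y_i1 = (at_risk & (group == 1)).sum()             # Y_i1
        d_i  = ((times == t_i) & (events == 1)).sum()     # d_i
        d_i1 = ((times == t_i) & (events == 1) & (group == 1)).sum()
        w = 1.0 if weight == "logrank" else float(y_i)    # W(t_i)
        num += w * (d_i1 - y_i1 * d_i / y_i)
        if y_i > 1:
            var += w**2 * (y_i1 / y_i) * (1 - y_i1 / y_i) \
                   * (y_i - d_i) / (y_i - 1) * d_i
    z = num / np.sqrt(var)
    return z, 2 * (1 - norm.cdf(abs(z)))  # Z and two-sided p-value; chi-square = Z**2
```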
Example: Now let us apply this group test to our example data. The Factory variable consists of two different entries, Factory1 and Factory2, by which we can stratify the survival time. First, in order to get a graphical picture of the differences, Figure: (2.3) shows the estimated survival functions stratified by the Factory variable.
Figure 2.3: Estimated survival functions on the example data stratified by Factory
Since the red line for Factory2 is consistently below the line for Factory1 we can suspect that Factory1 produces components with longer survival than Factory2. However, we need to look at the test results in Table: (2.1) to confirm this. Looking at Table: (2.1) we see that neither of the test statistics is significant at the 5% level; thus there are (so far) no significant differences between the two factories.
Table 2.1: Tests of equality in the example data stratified by Factory
Test Chi-Square DF Pr> Chi-Square
Log-Rank 0.4644 1 0.4956
Wilcoxon 1.2141 1 0.2705
2.1.6 The Cox Proportional Hazards model
Sometimes there is more than just a group variable that differentiates the individuals in a study. In such a case the method presented in Section: 2.1.5 does not suffice. What we need is a more complex model that can take multiple variables into account in the same way as an ordinary linear model does. The proportional hazards model is such a model and it was presented by Cox (1972); the model is therefore often referred to as the Cox proportional hazards model.
Sticking with the notation of Klein and Moeschberger (2005), we let T_j denote the time individual j has been/was in the study, δ_j indicates whether the individual has experienced the event (δ_j = 1 if the event has occurred) and Z_j(t) is a vector consisting of the k = 1, \ldots, p observed covariates for individual j at time point t. In total there are n individuals (j = 1, \ldots, n). For simplicity, and because there will be no time-dependent variables in this study, we can let Z_j(t) = Z_j. Using these inputs we can formulate the model suggested by Cox (1972), where h(t|Z) is the hazard rate at time t and Z is the covariate vector.
h(t|Z) = h0(t)c(βtZ) (2.19)
In this equation h0(t) represents a baseline hazard, that is, ignoring all other variables
there will still be some hazard attributed toward experiencing the event. The fact that
this proportional hazard model allows for the actual distribution of the survival to be
unknown, or unspecified, is a key feature that is part of what has made the model very
popular. This is also what makes the model semiparametric. By specifying h_0(t) one could obtain parametric models such as the exponential or Weibull models. (Allison, 2010, p. 126-127)
In Equation: (2.19) c(βtZ) is a known function and a requirement is that h(t|Z) must
be positive. According to Klein and Moeschberger (2005, ch. 8.1) a commonly chosen
function for c(βtZ) is:
c(βtZ) = exp(βtZ) (2.20)
This is also the function that Cox (1972) uses. In this equation,
\beta^t = \begin{pmatrix} \beta_1 & \beta_2 & \beta_3 & \ldots & \beta_p \end{pmatrix} \quad \text{and} \quad Z = \begin{pmatrix} Z_1 \\ Z_2 \\ Z_3 \\ \vdots \\ Z_p \end{pmatrix}    (2.21)

thus

\beta^t Z = \beta_1 Z_1 + \beta_2 Z_2 + \beta_3 Z_3 + \cdots + \beta_p Z_p = \sum_{k=1}^{p} \beta_k Z_k    (2.22)

and we can rewrite Equation: 2.19 to

h(t \mid Z) = h_0(t) \exp(\beta^t Z) = h_0(t) \exp\left(\sum_{k=1}^{p} \beta_k Z_k\right).    (2.23)
Hazard rate
So why is it called “proportional hazards”? This is because we can use the Cox proportional hazards model to look at the difference in hazards (of experiencing the event) between two individuals with different covariate values via the hazard ratio (Klein and Moeschberger, 2005, ch. 8.1):

\frac{h(t \mid Z)}{h(t \mid Z^*)} = \frac{h_0(t) \exp\left(\sum_{k=1}^{p} \beta_k Z_k\right)}{h_0(t) \exp\left(\sum_{k=1}^{p} \beta_k Z_k^*\right)} = \exp\left[\sum_{k=1}^{p} \beta_k (Z_k - Z_k^*)\right]    (2.24)
This is a ratio of the two individuals’ hazards, where Z and Z^* are the sets of covariates for the first and second individual respectively. The ratio is a constant, which means that the hazard rates of two different individuals are proportional over time. This proportionality of the hazards between two individuals is the key assumption of the model. The easiest way of describing how to interpret this ratio is the case where only one covariate differs between the two individuals. Klein and Moeschberger (2005, ch. 8.1) use the case of one individual receiving a medicine (Z_1 = 1) while the other individual receives a placebo (Z_1 = 0) and all other covariates (Z_2 to Z_p) have the same value. In this case Equation: 2.24 reduces to

\frac{h(t \mid Z)}{h(t \mid Z^*)} = \exp(\beta_1)    (2.25)

where \exp(\beta_1) is the risk of the event occurring for an individual who has received the medicine, in comparison with the placebo. If the medicine improves survival the ratio will be less than 1.
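As a purely hypothetical numeric illustration (the coefficient value below is invented, not taken from the thesis): if the estimated coefficient for the medicine indicator were

\hat{\beta}_1 = -0.5 \quad \Rightarrow \quad \frac{h(t \mid Z)}{h(t \mid Z^*)} = \exp(-0.5) \approx 0.61,

then the treated individual would have roughly 61% of the placebo individual's hazard at every time t, i.e. a 39% lower instantaneous risk of the event.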
Estimating β using Partial Maximum Likelihood
We can estimate the values of β when there are no ties in the time to event by the partial likelihood in Equation: 2.26. In the next section we will explore the method when ties are present. The partial likelihood function was also introduced by D. R. Cox in his article from 1972, and it is called partial because the baseline hazard (h_0(t)) can be unknown and is left out of the covariate estimation. Due to this lack of information in the likelihood function the standard errors will be larger than if the whole model had been used to estimate the coefficients; however, this is not a huge problem. The gain is
that the estimates still will have good properties without any knowledge of the baseline
hazard and they will be consistent and asymptotically normal for large samples. (Allison,
2010, p. 126-129)
This likelihood can be derived from Equation: 2.23 as presented by Cox (1972):
L(\beta) = \prod_{i=1}^{D} \frac{\exp\left(\sum_{k=1}^{p} \beta_k Z_{(i)k}\right)}{\sum_{j \in R(t_i)} \exp\left(\sum_{k=1}^{p} \beta_k Z_{jk}\right)}    (2.26)
In this equation Z_{(i)k} is the kth covariate associated with the individual that has failure time t_i. The numerator involves only the individual that experiences the event at time t_i, and the denominator is a sum over the set of individuals who were still in the study just before time t_i (the risk set R(t_i)). Taking the log of the likelihood in 2.26 gives:
\log L(\beta) = \sum_{i=1}^{D} \left(\sum_{k=1}^{p} \beta_k Z_{(i)k}\right) - \sum_{i=1}^{D} \ln\left[\sum_{j \in R(t_i)} \exp\left(\sum_{k=1}^{p} \beta_k Z_{jk}\right)\right]    (2.27)
The maximum likelihood (or, in this case, partial maximum likelihood) estimates are found by maximizing this equation. This is done by setting the derivative of 2.27 with respect to β (known as the score function) equal to zero. The score function is given by Cox (1972) and presented in our notation by Klein and Moeschberger (2005, ch. 8.1):
U_h(\beta) = \frac{\partial \log L(\beta)}{\partial \beta_h} = \sum_{i=1}^{D} Z_{(i)h} - \sum_{i=1}^{D} \frac{\sum_{j \in R(t_i)} Z_{jh} \exp\left(\sum_{k=1}^{p} \beta_k Z_{jk}\right)}{\sum_{j \in R(t_i)} \exp\left(\sum_{k=1}^{p} \beta_k Z_{jk}\right)}    (2.28)
The estimates themselves are then found by solving U_h(\beta) = 0 for each h = 1, \ldots, p. Calculating these estimates can be done numerically using some kind of iterative process, such as the Newton-Raphson method, but in this thesis we will rely on the results given to us by our software.
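The thesis leaves the maximization to SAS; as an illustrative sketch only, the negative of the log partial likelihood (2.27) for data without ties can be written down and maximized numerically as below. The names `times`, `events` and `Z` are illustrative (observed times, 0/1 event indicators, and an n × p covariate matrix).

```python
import numpy as np
from scipy.optimize import minimize

def neg_log_partial_likelihood(beta, times, events, Z):
    """Negative of Equation (2.27), assuming no tied event times."""
    times, events = np.asarray(times), np.asarray(events)
    Z = np.asarray(Z, dtype=float)
    eta = Z @ np.asarray(beta, dtype=float)   # linear predictor beta'Z_j
    loglik = 0.0
    for i in np.where(events == 1)[0]:
        risk_set = times >= times[i]          # R(t_i): still in the study at t_i
        loglik += eta[i] - np.log(np.exp(eta[risk_set]).sum())
    return -loglik

# Hypothetical usage (a numerical stand-in for Newton-Raphson):
# fit = minimize(neg_log_partial_likelihood, x0=np.zeros(Z.shape[1]),
#                args=(times, events, Z), method="BFGS")
# b = fit.x   # estimated coefficients
```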
In the next section we will look at ways to test the global hypothesis regarding the estimates, and in those tests we will need the information matrix. This matrix is the negative of the second derivative of Equation: 2.27. The information matrix will be denoted by I(\beta), a p × p matrix whose (g, h)th element is (Klein and Moeschberger, 2005, ch. 8.1):
I_{gh}(\beta) = \sum_{i=1}^{D} \left[ \frac{\sum_{j \in R(t_i)} Z_{jg} Z_{jh} \exp\left(\sum_{k=1}^{p} \beta_k Z_{jk}\right)}{\sum_{j \in R(t_i)} \exp\left(\sum_{k=1}^{p} \beta_k Z_{jk}\right)} - \frac{\sum_{j \in R(t_i)} Z_{jg} \exp\left(\sum_{k=1}^{p} \beta_k Z_{jk}\right)}{\sum_{j \in R(t_i)} \exp\left(\sum_{k=1}^{p} \beta_k Z_{jk}\right)} \times \frac{\sum_{j \in R(t_i)} Z_{jh} \exp\left(\sum_{k=1}^{p} \beta_k Z_{jk}\right)}{\sum_{j \in R(t_i)} \exp\left(\sum_{k=1}^{p} \beta_k Z_{jk}\right)} \right]    (2.29)
Ties
Estimating the β values using the Partial Maximum Likelihood method described above
is fine as long as there are no ties in the survival time. If any two events occur at the same
point in time then the estimation of β must be adjusted. There are a number of methods for handling and setting up the partial maximum likelihood when ties are present, and Klein and Moeschberger (2005, ch. 8.4) outline three classical ones: the Breslow, the Efron and the Cox (or Discrete) methods.
Allison (2010, ch. 5) describes the Discrete method and another method in depth, the Exact method. The Discrete method assumes that time is discrete, hence if two events happen at the same time there is no underlying order; both events really happened at the same time. In most cases this is highly unlikely, and tied events are often due to the fact that we cannot measure time exactly enough. The Exact method assumes that there is an underlying order of the events, but since we can only observe events at time intervals we cannot determine which of the events occurred first if two events are recorded at the same time point, t_i. It is this Exact method that will be relevant in our case, since our time data will be on a daily interval; two events can thus happen on the same day but we will not be able to tell which occurred earlier in the day. We briefly go through the theory of the Exact method next.
The partial maximum likelihood derivation of β described in the previous section required all events to be individually ordered in Equation: (2.26); ties therefore pose a problem when using this method. As explained, the Exact method assumes that ties are due to inexact measurement of time and that there is a true underlying order. Using this assumption and some basic probability theory, the Exact method can evaluate the likelihood at those points in time where ties are present using all possible orderings of the events. If we assume that 3 events are tied at the 5th point in time where events are observed, t_5, then there are 3! = 6 different orders in which these events could have occurred in reality. If we let each of these 6 orderings be denoted by A_i, then since the A_i are mutually exclusive, the probability of their union is the sum of the individual probabilities. The likelihood contribution at the 5th point in time will then be L_5 = \sum_{i=1}^{6} \Pr(A_i).
Global test
The global hypothesis that we will be concerned with is H_0 : \beta = \beta_0 vs H_1 : \beta \ne \beta_0. What we test is whether our model with the covariates is significantly different from a reduced model. This reduced model, \beta_0, can be seen as the model without the variables of interest, while \beta represents the model including the variables we want to test; \beta is called the full model. (Kleinbaum and Klein, 2012, p. 103)
We will let b = (b_1, b_2, b_3, \ldots, b_p)' represent the estimated coefficients of \beta, derived using the partial maximum likelihood method presented in the previous section. There are three different tests that we will look at in this section and use later in our analysis in Section: 4.2. All of them use this hypothesis but with slightly different test statistics. The three tests presented below are those that will be reported by SAS when we run our model.
The Wald test uses the fact that b has a p-variate normal distribution for large samples, with mean β and variance-covariance matrix I^{-1}(b). The test statistic is then χ² distributed with p degrees of freedom and is given by: (Klein and Moeschberger, 2005, p. 254)
χ2W = (b− β0)tI(b)(b− β0) (2.30)
The Likelihood Ratio test is also χ2 distributed with p degrees of freedom for large
samples of n but uses the following test statistic: Klein and Moeschberger (2005, p. 254)
χ2LR = 2[logL(b)− logL(β0)] (2.31)
In this test statistic, log L(b) represents Equation: 2.27 evaluated at the estimated values of β, while log L(β0) is Equation: 2.27 with the reduced model. According to Kleinbaum
and Klein (2012, p. 104) the Likelihood ratio test has better statistical properties than
the Wald test but in general, at least for large samples, they produce fairly similar test
statistics and rejections.
The last test that we will look at is the Score test which as the name suggests uses
the score function. In this test U(β) = (U1(β), . . . , Up(β))t and U(β) is asymptotically
p-variate normal with mean 0 and covariance matrix I(β). The test statistic is χ2
distributed with p degrees of freedom for large samples and is on the following form:
Klein and Moeschberger (2005, ch. 8.3)
\chi^2_S = U(\beta_0)^t\, I^{-1}(\beta_0)\, U(\beta_0)    (2.32)
In Section: (4.2) we will mainly focus on the Likelihood Ratio statistic due to its better statistical properties, but all of these statistics will be presented when a model has been estimated. We will also perform some tests on subsets of variables from the model, and in those cases only the Wald statistic will be used, since it is readily available in SAS for such tests.
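As a small illustration of how the likelihood ratio statistic (2.31) is computed (the numbers below are hypothetical, not thesis results; in practice SAS reports the statistic directly):

```python
from scipy.stats import chi2

def likelihood_ratio_test(logL_full, logL_reduced, df):
    """Likelihood ratio test of Equation (2.31).

    logL_full    : maximised log partial likelihood of the full model
    logL_reduced : log partial likelihood of the reduced model (beta = beta_0)
    df           : number of parameters tested, p
    """
    stat = 2.0 * (logL_full - logL_reduced)
    return stat, chi2.sf(stat, df)

# Hypothetical example: logL_full = -64.3 and logL_reduced = -85.3 with 3 covariates
# give chi2 = 42.0 on 3 degrees of freedom and a p-value well below 0.0001.
```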
Example: In Table: (2.2) the result of a fitted Cox model is presented. In this model all available covariates are used to model the survival of our components. All covariates are significant, with the largest P-value being 0.0155. Since the hazard ratio for Usage is smaller than 1, namely 0.815, each extra unit of usage lowers the hazard of breakage by a factor of 0.815. A higher grade awarded in the tensile strength test increases the hazard of the event occurring (a lower test score is better). Factory1 is a dummy for the Factory covariate with Factory2 as reference. This means that the estimated hazard ratio for Factory1 is the hazard for Factory1 components in comparison to those from Factory2. In contrast to the result in Table: (2.1), we now have a significant effect from the Factory variable.
Table 2.2: Estimated Cox model using example data

Variable   DF  Parameter Estimate  Standard Error  Chi-Square  Pr > Chi-Square  Hazard Ratio  95% Hazard Ratio Confidence Limits
Usage      1   -0.20406            0.05470         13.9180     0.0002*          0.815         0.729 - 0.905
Grade      1    0.08821            0.01872         22.2083     < 0.0001*        1.092         1.054 - 1.135
Factory1   1   -1.138955           0.47061          5.8573     0.0155*          0.320         0.122 - 0.783

* = significant on the 5% level
The global tests of the fitted Cox model are presented in Table: (2.3), and from the results we can deduce that the model itself is significant, since all three test statistics have a P-value of less than 0.0001. In this table we can also see the Akaike information criterion, which we will look closer at in the next section.
Table 2.3: Tests of global hypothesis, β = 0, using example data
Test Chi-Square DF Pr> Chi-Square
Likelihood Ratio 41.9147 3 < 0.0001*
Score 40.3968 3 < 0.0001*
Wald 31.3107 3 < 0.0001*
Akaike information criterion (AIC) 134.695
∗=significant on the 5% level
2.1.7 Assumptions, Goodness of Fit and Diagnostics
In this section we will look at methods to check the vital proportional hazards assumption and to evaluate how good the model is. We will begin by looking at two ways of checking for proportional hazards, then we will look at some information criteria, before we examine some residuals.
Test of Proportional Hazards
Although we have not considered time-dependent variables so far, since we will not have any such variables in this thesis, there is still a use for them that we will mention. Klein and Moeschberger (2005, ch. 9.2) describe how time-dependent covariates can be used to test the critical assumption of proportional hazards that is needed when modeling with the Cox proportional hazards model. In order to test whether a covariate, Z_1, that is not time dependent violates the proportional hazards assumption, we first create a new covariate from Z_1 which is artificially time dependent. Let this new variable be Z_2(t) = Z_1 × g(t), where g(t) is a function of time such as g(t) = \ln t. The
hazard rate at time t (Equation: 2.19) would be,
h(t|Z1) = h0(t) exp[β1Z1 + β2(Z1 × g(t))] (2.33)
and the hazard ratio between two individuals with different values on Z1 is
\frac{h[t \mid Z_1]}{h[t \mid Z_1^*]} = \exp[\beta_1(Z_1 - Z_1^*) + \beta_2\, g(t)(Z_1 - Z_1^*)].    (2.34)
It is clear from this hazard ratio that it depends on time, through g(t), only if \beta_2 \ne 0. Thus, by testing the hypothesis H_0: \beta_2 = 0 we test whether the proportional hazards assumption holds for the covariate in question. A rejection of H_0 means that the assumption is violated. What this means in practice is that we look at the p-value of the estimated parameter for the new time-dependent variable when estimating the model. If that parameter is significant (different from zero) then the proportional hazards assumption does not hold for that variable. In our case we will also test the assumption on dummy variables, and since all dummies have to be tested together, a joint linear test that all the created time-dependent dummy variables are zero will
be used.
Persson and Khamis (2008) present and test the statistical properties and power for a number of choices of g(t), such as \sqrt{t}, e^t and \ln(t). They found that both \sqrt{t} and \ln(t) are good choices, and since \ln(t) is also the choice that Klein and Moeschberger (2005, ch. 9.2) propose, we will use g(t) = \ln(t) when we test the proportional hazards assumption in Section: (4.2).
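For illustration only: the thesis carries out this test in SAS by adding the artificial covariate Z_1 × ln(t). A closely related check, based on scaled Schoenfeld residuals with the same ln(t) time transform rather than the explicit product covariate, is available in the Python package lifelines; a minimal hypothetical sketch, assuming a data frame with illustrative column names, could look as follows.

```python
import pandas as pd
from lifelines import CoxPHFitter
from lifelines.statistics import proportional_hazard_test

# Illustrative data: one row per subject with columns "time", "event"
# and the covariates (the names are assumptions, not from the thesis).
df = pd.read_csv("survival_data.csv")

cph = CoxPHFitter()
cph.fit(df, duration_col="time", event_col="event")

# Schoenfeld-residual-based test of proportional hazards with g(t) = ln(t);
# a significant p-value for a covariate suggests the assumption is violated.
result = proportional_hazard_test(cph, df, time_transform="log")
result.print_summary()
```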
There are different solutions available if the proportional hazards assumption fails when tested in this manner. One solution is to include the created time-dependent variable in the estimated model; however, that variable will be hard to interpret. For further discussion on this, see e.g. Klein and Moeschberger (2005, ch. 9.2).
Graphical Evaluation of Proportional Hazards, Arjas Plot
One way to graphically check the proportional hazards assumption is through the Arjas plot, first presented by Arjas (1988). The Arjas plot is not limited to checking the assumption of proportional hazards; it can also be used to check the overall fit of a proportional hazards regression model such as the Cox model. Let us assume that Z^* is a set of covariates included in the model and that we are considering adding a new covariate, Z_1. Using the Arjas plot we can evaluate whether Z_1 should be included and whether Z_1 has proportional hazards adjusted for the existing covariates. (Klein and Moeschberger, 2005, ch. 11.4)
There exist a number of methods to check for proportional hazards graphically, as described by Persson and Khamis (2007) in their comparison of different methods. They recommend the Arjas plot as the preferred method of assessing the proportional hazards assumption graphically, except in the special case where the hazard is strictly increasing. This method will therefore be used together with the test of proportional hazards based on a created function-of-time variable (described in the previous section).
Klein and Moeschberger (2005, ch. 11.4) outline what is needed in order to produce the Arjas plot. First we estimate the proportional hazards model using the covariate set Z^*. If Z_1 is continuous we need to group it into K levels, making it discrete. For each of these levels (or the existing levels of a categorical variable) and at every event time, t_i, the Total Time on Test (TOT) is calculated from the estimated cumulative hazard of the model, together with the total number of observed events up to that time point, N. The calculations of these two statistics are given next:
TOT_g(t_i) = \sum_{j:\, Z_{1j} = g} \hat{H}\big(\min(t_i, T_j) \mid Z^*_j\big)    (2.35)

N_g(t_i) = \sum_{j:\, Z_{1j} = g} \delta_j\, I(T_j \le t_i)    (2.36)
What this means is that TOT_g(t_i) is the sum of the estimated cumulative hazards from the model using the covariate set Z^*, over all individuals in group g, g = 1, \ldots, K, evaluated at time t_i or, if smaller, at the individual’s observed time T_j. N_g(t_i) is simply the number of observed events in the same group up to the same point in time. If Z_1 is redundant and not needed for the model’s fit, a plot of N_g versus TOT_g, the Arjas plot, will result in a 45° line through the origin. If we plot N_g versus TOT_g for all the groups this will result in K lines, and if they are linear but differ from the 45° line, Z_1 should be included in the model. Finally, if the lines produced are not linear, this indicates a violation of the proportional hazards assumption. (Klein and Moeschberger, 2005, ch. 11.4)
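Since the Arjas plot is not a standard output of most software, the following is a minimal sketch (an assumption-laden illustration, not the thesis’ SAS code) of how the points (TOT_g(t_i), N_g(t_i)) in (2.35)-(2.36) could be computed. The callable `H_model` is a hypothetical stand-in for the estimated cumulative hazard of subject j under the model with covariates Z^* only, e.g. a baseline cumulative hazard multiplied by exp(b·Z^*_j).

```python
import numpy as np

def arjas_curves(T, delta, groups, H_model):
    """Return {group: (TOT_g, N_g)} evaluated at the observed event times.

    T       : observed times
    delta   : event indicators (1 = event, 0 = censored)
    groups  : group label of each subject (the levels of Z_1)
    H_model : H_model(t, j) -> estimated cumulative hazard of subject j at t
    """
    T, delta, groups = map(np.asarray, (T, delta, groups))
    event_times = np.unique(T[delta == 1])
    curves = {}
    for g in np.unique(groups):
        members = np.where(groups == g)[0]
        tot = [sum(H_model(min(t_i, T[j]), j) for j in members)        # (2.35)
               for t_i in event_times]
        n = [int(np.sum((T[members] <= t_i) & (delta[members] == 1)))  # (2.36)
             for t_i in event_times]
        curves[g] = (np.array(tot), np.array(n))   # plot N_g against TOT_g
    return curves
```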
Example: We will look at an example of the Arjas plot using our example data. In Figure: (2.4) the Factory variable is checked for proportional hazards. Since it consists of Factory1 and Factory2, this grouping is a natural choice for the Arjas plot, and looking at Figure: (2.4) neither of the two factories seems to have proportional hazards; both lines are curving.
Figure 2.4: Arjas plot of the example data by Factory
Comparing Models using Information Criteria
In order to evaluate whether one model is better than another, that is, to compare goodness of fit, we can use so-called information criteria. These information criteria only give an informal indication of which model fits best; the difference between two criterion values cannot be formally tested. (Allison, 2010, p. 74)
All three of the criteria below use the log-likelihood; specifically, they use -2 \log L, i.e. -2 × [Equation (2.27)]. While one can use -2 \log L itself as a goodness of fit statistic, we will focus on three variations of it which adjust for the number of covariates in the respective model; to be specific, they penalise a model with more covariates. Below these statistics are presented as Allison (2010, p. 74-75) presents them (here k denotes the number of covariates, p in our notation, and n the number of observations):
AIC = -2 \log L + 2k
BIC = -2 \log L + k \log n
AICC = -2 \log L + 2k + \frac{2k(k+1)}{n-k-1}    (2.37)
Akaike’s information criterion (AIC) and the Bayesian information criterion (BIC) both penalise additional covariates, where BIC in most applications penalises the most. The corrected Akaike’s information criterion (AICC) is a slight alteration of the AIC statistic which takes the number of observations into account and thus may behave better in small samples. (Allison, 2010, p. 74-75) Important to remember is that when comparing the goodness of fit statistics for different models, only the same criterion can be compared between models; one cannot compare AIC for one model with BIC for another. When we estimate our models in Section: (4.2) we will use the AIC value to compare models.
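As a trivial illustration of (2.37) (the values of logL, k and n would come from a fitted model; nothing here is taken from the thesis):

```python
import numpy as np

def information_criteria(logL, k, n):
    """AIC, BIC and AICC from the maximised log (partial) likelihood logL,
    the number of covariates k and the number of observations n."""
    aic  = -2 * logL + 2 * k
    bic  = -2 * logL + k * np.log(n)
    aicc = -2 * logL + 2 * k + 2 * k * (k + 1) / (n - k - 1)
    return aic, bic, aicc
```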
Residuals
Martingale residuals
While we might know which covariates we want to use in our Cox proportional hazards model, we might be uncertain about which form of a variable best explains its effect on survival. It might be that Z^2, \log Z or some other transformation explains the variable’s contribution to survival better than the plain form Z. Furthermore, we will have continuous variables in this thesis, and it might be appropriate to discretize one or more of them in order to better estimate their influence in the model. A modification of the Cox-Snell residuals called Martingale residuals can be used to find an appropriate functional form of a covariate and to judge whether it should be discretized. The Martingale residual, M, for right-censored data and time-independent covariates is defined in Equation: 2.38. It has the property that \sum_{j=1}^{n} \hat{M}_j = 0, and for large samples the \hat{M}_j are uncorrelated with population mean zero. These residuals can be seen as the difference between the observed number of events, \delta_j, and the expected number of events, r_j. The Martingale residuals thus represent the excess number of events in the data that is not predicted by the Cox proportional hazards model. (Klein and Moeschberger, 2005, ch. 11.3)
Mj = δj − rj (2.38)
The method for finding an appropriate functional form for the variable of interest is to exclude that variable when estimating the Cox proportional hazards model and then calculate the Martingale residuals. Let Z_1 represent the variable of interest. Next we plot the residuals \hat{M}_j against Z_1 for each observation j. Usually a smoothed fit of the scatter plot is used to get a clear sense of its “linearity”. It is this fitted curve that indicates what functional form of Z_1 should be used; the aim is a fitted (smoothed) curve that is linear. If the curve contains a clear break when Z_1 is continuous, this indicates that discretizing Z_1 would be appropriate. (Klein and Moeschberger, 2005, ch. 11.3)
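Purely as an illustration of this procedure (the thesis produces these plots in SAS), a hypothetical sketch using the Python packages lifelines and statsmodels is shown below; the data frame and the column names "time", "event" and "z1" are assumptions, and the residuals are assumed to align with the data frame’s index.

```python
import pandas as pd
import matplotlib.pyplot as plt
from lifelines import CoxPHFitter
from statsmodels.nonparametric.smoothers_lowess import lowess

df = pd.read_csv("survival_data.csv")   # "time", "event", "z1" + other covariates

# Fit the Cox model WITHOUT the variable of interest z1
cph = CoxPHFitter()
cph.fit(df.drop(columns=["z1"]), duration_col="time", event_col="event")

# Martingale residuals of that model, plotted against z1 with a lowess smooth;
# a clear break in the smooth suggests discretizing z1.
resid = cph.compute_residuals(df.drop(columns=["z1"]), kind="martingale")
z1 = df.loc[resid.index, "z1"]
smooth = lowess(resid["martingale"], z1)

plt.scatter(z1, resid["martingale"], s=10)
plt.plot(smooth[:, 0], smooth[:, 1], color="red")
plt.xlabel("z1"); plt.ylabel("Martingale residual")
plt.show()
```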
Example: In the example data we have a variable called Grade. This variable could represent a test score given to the component in a tensile strength test, and thus it might help predict whether a component will break. In Figure: (2.5) the Martingale residuals for Grade are plotted with a fitted curve. There is a clear break in this fitted curve, which indicates that the Grade variable, which is continuous, should be discretized. Looking at the figure suggests that all test scores below about 60-65 could be grouped into one group and all above into another.
Figure 2.5: Plot of the Martingale residuals for the Grade variable in the example data
Cox-Snell residuals
While AIC, BIC and AICC can be used to compare the goodness of fit for different
models they do not tell if a specific model has a good fit. One way of testing if an estimated
Cox proportional hazards model has good fit is through the Cox-Snell residuals. Lee and
Wang (2003, ch. 8.4) present the Cox-Snell residuals as,
rj = − log S(tj) (2.39)
where t_j is the observed survival time, censored or uncensored, for individual j, and \hat{S}(t) is the estimated survival function based on the estimated covariate effects. If the observation at t_j is censored then the corresponding r_j is treated as a censored observation. This means that when plotting these residuals they will be represented by a step-like line, just as the survival function and cumulative hazard function are.
If the fitted Cox proportional hazards model is good, that is, if the model fits the data, then the residuals should line up on a 45◦ line when plotted versus the estimated cumulative hazard rate of these residuals. If a model fits the dataset then the Cox-Snell residuals will follow the unit exponential distribution. If we let ŜR(r) represent the Kaplan-Meier estimate with respect to the residuals, then − log ŜR(r) will be the estimated cumulative hazard, and for each individual − log ŜR(rj) ≈ rj if the model is appropriate. (Lee and Wang, 2003, ch. 8.4)
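A rough sketch of how such a Cox-Snell plot could be produced is shown below, again with Python's lifelines as an assumed stand-in for the SAS procedures used in the thesis. It exploits the relation Mj = δj − rj from Equation 2.38, so the Cox-Snell residuals are obtained as rj = δj − Mj; the data frame df and its columns are hypothetical.

```python
import pandas as pd
import matplotlib.pyplot as plt
from lifelines import CoxPHFitter, NelsonAalenFitter

# Hypothetical dataset with 'duration', 'event' and the model covariates.
df = pd.read_csv("example_data.csv")

cph = CoxPHFitter()
cph.fit(df, duration_col="duration", event_col="event")

# Cox-Snell residuals via r_j = delta_j - M_j (rearranged Equation 2.38).
mart = cph.compute_residuals(df, kind="martingale").iloc[:, 0]
cox_snell = df["event"] - mart            # aligned on the row index

# Estimate the cumulative hazard of the residuals (treating censored
# observations as censored residuals) and compare it with the 45 degree line.
naf = NelsonAalenFitter()
naf.fit(cox_snell, event_observed=df["event"])
plt.step(naf.cumulative_hazard_.index,
         naf.cumulative_hazard_.iloc[:, 0], where="post")
upper = float(cox_snell.max())
plt.plot([0, upper], [0, upper], linestyle="--")   # 45 degree reference line
plt.xlabel("Cox-Snell residual")
plt.ylabel("Estimated cumulative hazard of the residuals")
plt.show()
```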
While the Cox-Snell residuals are useful when determining the overall goodness of fit for a model, Klein and Moeschberger (2005, p. 358) point out that this method gives no indication of why a model does not fit well in such cases. Furthermore, the exponential distribution for the residuals only holds exactly when the true values of β and the cumulative hazard are used rather than estimates, and since we want to check the goodness of fit of an estimated model these values will be estimated. Thus departure from the exponential distribution (which will be observed as departure from the 45◦ line in the figure) can be due to uncertainty in the estimates of β and the cumulative hazard. This effect will be largest in the right tail of the distribution (and in the right end of the figure) for small samples. (Klein and Moeschberger, 2005, p. 359)
Example: In Table: (2.2) we estimated a Cox model using our example data. Now we will look at the goodness of fit of that model using the Cox-Snell residuals. In Figure: (2.6) the Cox-Snell plot is presented and the residuals deviate somewhat from the 45◦ line. This model does not seem to fit the data perfectly.
Figure 2.6: Plot of Cox-Snell residuals for the estimated model (all covariates) of the
example data
2.2 Key Performance Indicators
There is a very large number of Key Performance Indicators (KPIs), or financial ratios, used to analyse all sorts of aspects of a company's financial health and well-being. Using these, an investor can decide whether they find the company worthy of investment. To include them all would be close to impossible. In this thesis we will instead focus on two kinds of KPIs aimed specifically at the companies' ability to pay off loans.
Penman (2010, ch. 19) defines two types of ratios that are of importance when
analysing a company’s ability to pay off its loans and avoid defaulting (going bankrupt
and being terminated or sold in order to give back what is possible to the lenders/banks).
These two main types of ratios are Liquidity Ratio and Solvency Ratio.
2.2.1 Liquidity
The Liquidity Ratios are concerned with short-term obligations (loans and debts that will have to be repaid within one year). As this might imply, liquidity gives an indication of how well a company will succeed in paying off loans and debt that are due in the near future. It means that these ratios are constructed using assets (things that can/will be turned into cash) and liabilities (debts and loans) that are, for assets, going to result in cash within a year and, for liabilities, coming due within a year. To distinguish these assets and liabilities from those with a longer time frame they are called current. (Penman, 2010, ch. 19)
So why is this of interest? Brealey et al. (2011, ch. 28.7) make the comparison to a household and a typical financial situation you can find yourself in. If you for some reason are facing a large unexpected bill you need capital that is easily accessible in order to pay it quickly. Savings and capital invested in stocks are examples of quick, easy money. But if those do not suffice then you might have trouble turning assets such as a car or a vacation house into cash quickly enough to meet the bill. In the same way companies have assets that are easy to realize as well as those that will take a significant amount of time before they are turned into cash.
There are in total six different types of liquidity measures presented by Penman (2010, ch. 19) and we will focus on those that are also presented by Brealey et al. (2011, ch. 28.7): those that indicate the ability of a company's current assets to pay for its current liabilities. The three that we will not focus on are concerned with how well different types of cash flow can cover liabilities and expenditures.
Current Ratio = Current Assets / Current Liabilities (2.40)

Quick (or Acid Test) Ratio = (Cash + Short-Term Investments + Receivables²) / Current Liabilities (2.41)

Cash Ratio = (Cash + Short-Term Investments) / Current Liabilities (2.42)
These three ratios in general explain the same thing but there are slight differences between them in the numerator. The Current Ratio is the most general one, based on all current assets, while the Quick Ratio excludes Inventories due to the fact that these are a bit slower to turn into cash. The Cash Ratio, as the name suggests, only takes into account Cash and those investments that can be liquidated almost immediately. (Penman, 2010, ch. 19)
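The three ratios are simple to compute once the balance-sheet items are known. The small helper below is only an illustration of Equations 2.40-2.42; the argument names are assumptions about how the balance-sheet items might be labelled.

```python
def liquidity_ratios(cash, short_term_investments, receivables,
                     inventories, other_current_assets, current_liabilities):
    """Compute the three liquidity ratios of Equations 2.40-2.42."""
    current_assets = (cash + short_term_investments + receivables
                      + inventories + other_current_assets)
    quick_assets = cash + short_term_investments + receivables
    return {
        "current_ratio": current_assets / current_liabilities,                 # (2.40)
        "quick_ratio": quick_assets / current_liabilities,                     # (2.41)
        "cash_ratio": (cash + short_term_investments) / current_liabilities,   # (2.42)
    }

# Example: quick assets of 150 against current liabilities of 150 gives a quick ratio of 1.0.
print(liquidity_ratios(50, 20, 80, 120, 10, 150))
```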
One thing to keep in mind is that not all companies are the same, and since they have different businesses what might be considered a good current ratio for one company could be bad for another in the eyes of an investor. A current ratio of 1:1 could be good for a company which quickly sells and restocks its inventory, while for a manufacturing company with a slow process from inventory to cash a ratio of 2:1 might just be acceptable. This slowness from inventory to cash might also be seen as giving an overoptimistic view of the situation, and hence the quick ratio could be seen as harsher since it excludes the inventories. (Melville, 2011, p. 364)
2Receivables are debts that other companies or customers owe the company
Later we will use the Quick Ratio, a good middle ground between the three kinds of ratios, to represent liquidity in our study.
2.2.2 Solvency
While liquidity gives investors a picture of the short-term situation, investors might also be interested in how well a company is equipped to pay off long-term debts. Thus investors look at Solvency Ratios in order to estimate a company's ability to cover debts in a more distant future. (Penman, 2010, ch. 19)
Below are presented three different types of solvency ratios that are of interest. There
are some differences between them but in general they are rather alike. As before these
three ratios are gathered from Penman (2010, ch. 19).
Debt to Total Assets = Total Debt / Total Assets³ (2.43)

Debt to Equity⁴ = Total Debt / Total Equity (2.44)

Long-Term Debt Ratio = Long-Term Debt / (Long-Term Debt + Total Equity) (2.45)
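As with the liquidity ratios, these are straightforward to compute once the balance-sheet items are available. The sketch below illustrates Equations 2.43-2.45; note that it assumes total debt can stand in for total liabilities when forming total assets (footnote 3), which is a simplification made only for this illustration.

```python
def solvency_ratios(total_debt, total_equity, long_term_debt):
    """Compute the three solvency ratios of Equations 2.43-2.45."""
    # Simplifying assumption for the illustration: total liabilities ~ total debt,
    # so total assets = liabilities + equity (footnote 3) becomes debt + equity.
    total_assets = total_debt + total_equity
    return {
        "debt_to_total_assets": total_debt / total_assets,                        # (2.43)
        "debt_to_equity": total_debt / total_equity,                              # (2.44)
        "long_term_debt_ratio": long_term_debt / (long_term_debt + total_equity), # (2.45)
    }

print(solvency_ratios(total_debt=400, total_equity=600, long_term_debt=250))
```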
It is interesting to note what Brealey et al. (2011, ch. 28.6) mention regarding these ratios. They use the book (or accounting) value of a company's equity instead of the market value. While in a default situation the market value is what determines whether debt holders get their money back, it also includes things that investors assume to be positive for the future value of the shares. Assets such as research and development are included in the market value during good times, but these values might disappear if times become bad; thus the market value is often ignored by lenders.
The Debt to Equity ratio, which in fact is closely related to another KPI commonly used in Sweden, the Solidity (Soliditet), will be used as a solvency ratio later on.
3Total Assets = Liabilities + Total Equity
4Known in Sweden as Skuldsättningsgrad
Data
In this part of the thesis we will focus on the data set gathered in order to answer our
questions from Section: (1.2). We will present a summary of how the dataset and the
variables were acquired and how the observations (company stocks) were selected. We
will also look closer at how our event is defined.
3.1 The Observations
In Section: (1.2) it was briefly mentioned that this study would span the year of 2008 and contain stocks listed on the main Swedish stock market, the OMX Stockholm exchange market. Thus an initial list of stocks traded on the first day of trade, January 2nd, 2008, was gathered from the newspaper Dagens Industri (DI). The January 3rd edition was used since it contains the closing prices of all stocks traded in Sweden during the first day of trade. The actual time series data were then collected using the software Datastream Professional from Thomson Reuters. Due to missing information/data, a total of 293 stocks from the original list of 301 companies, with their daily closing prices during 2008, were gathered from Datastream. In Appendix: (7.3) a list of all 293 stocks is presented.
3.2 The Variables
Apart from the necessary event time variable, which will be presented in Section (3.3), a number of covariates will be used in our study. What sector a company operated in will of course be a vital variable since the primary interest in this thesis is the effect on stock price attributed to the sectors during 2008. During 2008 stocks in Sweden were typically divided into 9 different sectors, and the information about which sector a particular company operated in was gathered from the January 2nd edition of DI.
Another variable that is also collected from DI is company size. It is logical to include
size in this study since the size of a company very well could contribute to how well it
survives a financial crisis like the one in 2008. By DI, and in general, companies are
sorted into three size groups: Small Cap, Mid Cap and Large Cap. What size group a company is sorted into is determined by the accumulated value of its stocks (number of stocks × price of one stock). If the stock value of a company is larger than 1 billion euro it is sorted into Large Cap. If the value is between 150 million euro and 1 billion euro it is a Mid Cap company, and finally companies with less than 150 million euro in accumulated stock value are regarded as Small Cap companies.
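For reference, the sorting rule can be written as a small function; the thresholds are those stated above, while the function name and the assumption that the accumulated stock value is given in euro are mine.

```python
def size_group(accumulated_stock_value_eur: float) -> str:
    """Sort a company into Small/Mid/Large Cap from its accumulated stock value (EUR)."""
    if accumulated_stock_value_eur > 1_000_000_000:       # above 1 billion euro
        return "Large Cap"
    if accumulated_stock_value_eur >= 150_000_000:        # 150 million to 1 billion euro
        return "Mid Cap"
    return "Small Cap"                                    # below 150 million euro

print(size_group(2.3e9), size_group(4.0e8), size_group(9.0e7))
```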
In Section: (2.2) we discussed the usage of Liquidity and Solvency ratios to evaluate the capacity of a company to pay off its loans. We will use these two measurements, or ratios, in our model in order to further adjust, apart from size, for the fact that not all companies are the same.
These ratios are reported by the companies for each fiscal year (the 12-month interval of financial reporting, typically Jan-Dec) and presented in the annual report for that year. This means that, in general, last year's ratios will be available 3-5 months into the new year. When our study starts in January 2008, strictly speaking, only the KPIs from the financial report regarding the year 2006 (presented during the spring of 2007) were available to the traders. This would imply that it is the Liquidity and Solvency ratios from 2006 that should be used in the study since these are the latest figures available at the beginning of the study. However, we will use the KPIs that were presented in the financial reports for 2007, thus available to the traders during the spring of 2008, rather than those from 2006 in order to get figures that are a bit more up to date.
The two ratios that will be used in this study are the Quick Ratio as an indicator of Liquidity and Debt to Equity as a solvency ratio. As with the time series data for each stock, these KPIs are gathered from Datastream Professional, where unfortunately some missing values regarding debt, liability and assets resulted in a total of 201 observations with information about both their Quick Ratio and Debt to Equity. As we will see in Section: (4.2) this means that models including these variables will have fewer observations in the study.
3.3 An Event
In order to find the time to event for each of our stocks we need to define what an event is. Since we are interested in finding differences that determine how well a company can withstand a recession, the death or termination of a company would be a suitable event. However, since few of the companies traded in the beginning of 2008 were terminated during the year, this would lead to few events. Thus this thesis will use the same method as Ni (2009), where the event is defined as a specified fall in a stock's price.

Looking at the OMX Stockholm share price index, which is an index compiled of the prices of all stocks traded on the OMX Stockholm exchange, we can determine that the index lost almost 50% of its value. In fact the index was at its all-year high notation on the first day of trade, January 2nd. Using that as the index base (value 100), its lowest notation of 50.18 was measured on the 21st of November. Since the index is a weighted average of all companies, some will have a higher value at a specific point in time and some will have a lower value than the index at that point. This means that although some companies also fell to 50% of their initial value during this year, others will not have done so.
In this study we will work with two types of event times. One where the study starts at the beginning of the year for all stocks and ends on the last day of trade. This corresponds to Type I censoring as described in Section: (2.1.3). Since the price index fell to a minimum of about 50% of its value at the beginning of the study, the arbitrary point of a price fall to below 60% of the initial value of the stock will be regarded as an event. This point is chosen such that most observations (stocks) will have experienced the event, avoiding a huge amount of censored observations (choosing the event to be a price fall to 50% of the initial stock value would result in more censored cases). To separate this time to event from the next one we will refer to it as 'Day One' start.
The second time to event variable that will be used is individual for each stock. This time to event uses the same window in time for the study; however, the study of each observation will start at its respective year-high notation. That is, for some stocks the time to event might be counted from the first day of the study, January 2nd, while other stocks will have their highest notation later in the year. In either case the study
will end at the end of the year. Thus we are still working with right censoring, but in this case Type I generalized censoring. Once again we will use the price index to find a suitable threshold for an event. Since the index has its highest notation on the first day of trade in the study, the maximum price fall from the highest notation and from the first notation of the year will be the same. This means that we can use a price fall to below 60% of each stock's highest notation as an event as well. This method is the one used by Ni (2009) and it will be referred to as individual time to event.
A difference that is noteworthy between these two methods is that in the first case,
’Day One’, all censored observations (except for random censoring events) will be at the
same time point (the maximum possible study time, one year or 261 days of trade). The
censoring time for observations that have not experienced the event will differ when we
use individual start since each observation is going to be observed for a different amount
of time.
A third method of defining the start of our study could be to determine the start of the financial crisis of 2008. One such definition of start time could be the Lehman Brothers collapse of 2008; however, other starts could be defined as well. Attempting to define the start of the 2008 financial crisis is left out of this study and we will only use the two time to event variables presented above.
In this dataset there are some cases where we cannot follow the stock's price to the end of the study even though the event has not occurred at the point where the observation is lost. Thus there are some cases where stocks are lost to random censoring (see Section: (2.1.3)). As explained, these observations can have experienced a number of things, such as mergers, buy-ups or termination of trade, which have resulted in the stock no longer being traded. For both types of time to event data, 'Day One' and individual, there are a total of seven cases where the stocks have been lost to the study. These have been censored at that time.
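To make the two definitions concrete, the sketch below derives both time to event variables from one stock's daily closing prices. This is only an assumed reconstruction in Python of the rule described above (event = a close below 60% of the reference price, censoring at the last day of trade); the handling of the seven randomly censored stocks and the exact day-counting convention used in the thesis may differ.

```python
import numpy as np
import pandas as pd

def time_to_event(prices: pd.Series, threshold: float = 0.60):
    """Return {'day_one': (time, event), 'individual': (time, event)} for one stock.

    prices: closing prices for the trading days of 2008, in chronological order.
    'Day One' start counts from the first trading day and compares against 60% of
    the first close; individual start counts from the stock's highest notation and
    compares against 60% of that high. event = 1 if the fall occurred, else 0."""
    p = prices.reset_index(drop=True)

    # --- 'Day One' start ------------------------------------------------------
    below = np.flatnonzero(p.values < threshold * p.iloc[0])
    day_one = (int(below[0]), 1) if below.size else (len(p) - 1, 0)

    # --- individual start (from the year-high notation) ------------------------
    high_pos = int(np.argmax(p.values))
    tail = p.iloc[high_pos:].reset_index(drop=True)
    below = np.flatnonzero(tail.values < threshold * tail.iloc[0])
    individual = (int(below[0]), 1) if below.size else (len(tail) - 1, 0)

    return {"day_one": day_one, "individual": individual}

# Toy example: a stock that peaks on day 2 and then falls below 60% of that peak.
toy = pd.Series([100, 105, 110, 90, 70, 64, 58])
print(time_to_event(toy))   # -> {'day_one': (6, 1), 'individual': (3, 1)}
```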
Results
Before we look at the different variables and their effect on the survival of the stocks we
will take a general look at the estimated survival function and cumulative hazards unad-
justed for any covariates. As explained in the previous section we will use two different
ways of measuring the survival time, the individual start where each stock is observed
after their respective highest notation and the ’Day One’ start where all stocks are ob-
served from the 2nd of January.
4.1 Survival Functions and Cumulative Hazards
First let’s look at the estimated survival function for the two different event variables using
the K-M or Product-Limit estimator. This method was outlined in Section: (2.1.4). The
estimated survival curves are presented in Figures: (4.1) and (4.2) below.
Figure 4.1: Estimated survival function with individual start
Figure 4.2: Estimated survival function with 'Day One' start
Looking at Figure: (4.1), the one with individual start, the events are rather evenly distributed throughout the study. However, in the 'Day One' figure, Figure: (4.2), the events are mostly clumped together, specifically around day 200. Since all stocks have the same starting day we can deduce that day 200 represents the 6th of October which, if we look back to Figure: (1.1), is around the time where the different indices have the highest price losses. Another difference is the censored cases. With individual start we have 35 censored cases with very different event times. In the 'Day One' study there are 63 censored cases and all except seven of these are censored on the last day of the study, day 261. The seven that are not censored at day 261 are those observations that are lost due to competing risks.
Looking at the cumulative hazards functions in Figures: (4.3) and (4.4) we see that the same patterns as we observed for the survival functions are present for the cumulative hazards. When we observe each stock with individual start the cumulative hazard rises more evenly than for the 'Day One' method. As discussed in Section: (2.1.4) the Nelson-Aalen estimator is used to estimate these cumulative hazards.
Figure 4.3: Estimated cumulative hazard with individual start
Figure 4.4: Estimated cumulative hazard with ’Day One’ start
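For completeness, a minimal sketch of how such curves could be estimated outside SAS is given below, using Python's lifelines library with a few toy values in place of the real durations; the real analysis would instead pass the 293 observed times and censoring indicators.

```python
import matplotlib.pyplot as plt
from lifelines import KaplanMeierFitter, NelsonAalenFitter

# Toy placeholders: observed time in trading days and event indicators
# (1 = price fell below the 60% threshold, 0 = censored).
durations = [45, 120, 200, 261, 198, 261, 87, 233]
events    = [1,   1,   1,   0,   1,   0,   1,  1]

kmf = KaplanMeierFitter()
kmf.fit(durations, event_observed=events, label="K-M survival")
kmf.plot_survival_function()                 # step-shaped survival curve

naf = NelsonAalenFitter()
naf.fit(durations, event_observed=events, label="Nelson-Aalen cumulative hazard")
naf.plot_cumulative_hazard()
plt.xlabel("Trading days")
plt.show()
```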
4.1.1 By Sectors
Now we will group our data by the two categorical variables, Sector and Size, and re-estimate the survival functions and the cumulative hazards. We will start with our main variable of interest, the Sector variable1. For the survival functions, the pattern from the previous section, where we noticed that the events are more evenly spread over time in the individual start study, continues when we have grouped by sector, but there are differences between the different sectors. The survival functions are presented in Figures: (4.5) and (4.6). One sector that appears to have low survival time under both time methods is the Energy sector. However, if we have a look at the Log-Rank and Wilcoxon test statistics in Tables: (4.1) and (4.2), these statistics are simultaneously significant only for the 'Day One' method on the 5% level, while the Log-Rank test, which as we described in Section: (2.1.5) puts equal weight on all time points, is not significant for the individual method (P-value 0.1032). This could indicate that the differences between at least one of the sectors and the rest are clearer when we use the 'Day One' time to event. Since the Wilcoxon test is significant (P-value 0.0234) in Table: (4.1), this indicates that for the individual start data there are differences between the sectors at an early stage. The cumulative hazards figures of this stratification are found in Appendix: (7.2).
1In Appendix:(7.1) some descriptive statistics are presented for this grouping.
Figure 4.5: Estimated survival function stratified by sector, individual start
Figure 4.6: Estimated survival function stratified by sector, ’Day One’ start
Table 4.1: Tests of equality stratified by sector, individual start
Test Chi-Square DF Pr> Chi-Square
Log-Rank 13.2618 8 0.1032
Wilcoxon 17.7207 8 0.0234*
∗=significant on the 5% level
Table 4.2: Tests of equality stratified by sector, 'Day One' start
Test Chi-Square DF Pr> Chi-Square
Log-Rank 25.3026 8 0.0014*
Wilcoxon 27.7024 8 0.0005*
∗=significant on the 5% level
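The log-rank comparison behind Tables 4.1 and 4.2 can be reproduced along the lines of the sketch below (lifelines again assumed as a stand-in for SAS; the toy data frame is hypothetical). Some versions of the library also offer Wilcoxon-type weightings, but only the unweighted log-rank test is shown here.

```python
import pandas as pd
from lifelines.statistics import multivariate_logrank_test

# Hypothetical rows: one per stock, with its time to event, event indicator and sector.
df = pd.DataFrame({
    "duration": [200, 150, 261, 95, 210, 261, 180, 130],
    "event":    [1,   1,   0,   1,  1,   0,   1,   1],
    "sector":   ["Energy", "IT", "Health", "Energy",
                 "Finance", "IT", "Industry", "Finance"],
})

result = multivariate_logrank_test(df["duration"], df["sector"], df["event"])
print(result.test_statistic, result.p_value)   # chi-square statistic with k-1 df
```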
4.1.2 By Size
Although size is not the main variable of interest in our study, the variable is easily stratified into the three groups, and we will quickly have a look at the survival functions for our data grouped by size2. The estimated survival functions are presented in Figures: (4.7) and (4.8).
Figure 4.7: Estimated survival function stratified by size, individual start
Figure 4.8: Estimated survival function stratified by size, 'Day One' start
2In Appendix: (7.1) some descriptive statistics are presented
It is clear from Figure: (4.7) that there is very little difference between the different-sized companies' survival times when we use the individual start. However, when we follow every company from the first day of trade it looks like there are some differences between the sizes. In Figure: (4.8) the survival function for Small Cap stocks is above those for Mid and Large Cap for most of the time. Looking at the test statistics for equality between the different sizes in Tables: (4.3) and (4.4) we find that no test is significant (far from it) for the individual time, but both the Log-Rank and Wilcoxon tests are highly significant for the 'Day One' time to event data with P-values 0.0002 and less than 0.0001. The cumulative hazards figures of this stratification are found in Appendix: (7.2).
Table 4.3: Tests of equality stratified by size, individual start
Test Chi-Square DF Pr> Chi-Square
Log-Rank 1.5523 2 0.4602
Wilcoxon 0.3442 2 0.8419
Table 4.4: Tests of equality stratified by size, 'Day One' start
Test Chi-Square DF Pr> Chi-Square
Log-Rank 16.7187 2 0.0002*
Wilcoxon 19.0375 2 < 0.0001*
∗=significant on the 5% level
4.2 Cox Modeling
We will now move on from evaluating the survival functions to looking at some estimated Cox proportional hazards models. Since ties are present we will use the Exact method from here on, as outlined in Section: (2.1.6), and in general a significance level of 5% will be used to determine significance.
4.2.1 Individual Study Start
First we will use the time to event data with individual start, and since our main interest is to find significant effects in the sector dummies we will begin with estimating the model using only the sector variable. In Table: (4.5) the significance of this variable is tested. It is a global test where H0 : βMaterial = βIndustry = . . . = βTele = 0; thus it tests whether "Sector" itself is significant, and if, as in Table: (4.5), this test is not significant we should not estimate the model using the sector as a covariate. In this test we will use the Wald test since it is the one that SAS gives. In Section: (2.1.6) we mentioned that the Likelihood Ratio statistic has better statistical properties for small samples, but the Wald statistic should suffice for our purpose. With a P-value of 0.1201 the Sector variable is not significant. The global test itself is also insignificant, as seen in Table: (4.6) where the P-value for the Likelihood Ratio is 0.1395. In this case, since Sector is the only variable in the model, the global test is the same as the test of the Sector significance (same Wald Chi-Square value).
Table 4.5: Tests of variable significance, individual start
Variables Wald Chi-Square DF Pr> Chi-Square
Sector 12.7671 8 0.1201
Table 4.6: Tests of global hypothesis, β = 0, individual start, Sector
Test Chi-Square DF Pr> Chi-Square
Likelihood Ratio 12.2726 8 0.1395
Score 13.2064 8 0.1049
Wald 12.7671 8 0.1201
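A sketch of how such a model could be fitted outside SAS is shown below, again using Python's lifelines as an assumed equivalent. Note that lifelines handles ties with Efron's approximation rather than the Exact method used in the thesis, so the estimates would not match the SAS output exactly; the file and column names are hypothetical.

```python
import pandas as pd
from lifelines import CoxPHFitter

# Hypothetical file: one row per stock with duration, event and sector.
df = pd.read_csv("stocks_2008_individual_start.csv")

# Expand Sector into dummy variables and drop Energy so it acts as the reference sector.
X = pd.get_dummies(df[["duration", "event", "sector"]], columns=["sector"])
X = X.drop(columns=["sector_Energy"])

cph = CoxPHFitter()
cph.fit(X, duration_col="duration", event_col="event")   # Efron ties approximation
cph.print_summary()                                       # per-dummy Wald tests and HRs
print(cph.log_likelihood_ratio_test())                    # global test of beta = 0
print(cph.AIC_partial_)                                   # AIC based on the partial likelihood
```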
In Table: (4.7) the Size variable is included in the model; however, this does not improve the model. This time both the Sector and Size variables fail their individual global tests, with P-values of 0.117 and 0.488 respectively. In Table: (4.8) we can also see that the model in general is insignificant, with a P-value of 0.1882 for the Likelihood Ratio.
Table 4.7: Tests of variable significance, individual start
Variables Wald Chi-Square DF Pr> Chi-Square
Sector 12.8479 8 0.1172
Size 1.4337 2 0.4883
Table 4.8: Tests of global hypothesis, β = 0, individual start, Sector and Size
Test Chi-Square DF Pr> Chi-Square
Likelihood Ratio 13.6782 10 0.1882
Score 14.6315 10 0.1461
Wald 14.1759 10 0.1651
Adjusting for companies' different KPI values by including the Quick Ratios and Debt to Equity ratios from the financial year 2007 in the model in Table: (4.10) results in yet another insignificant model. The whole model is insignificant with P-value 0.1325 for the Likelihood Ratio test. Also, in Table: (4.9) none of the variables are significant; Sector has a P-value of 0.14 which is far from the P-value of 0.05 that is our threshold for significance. At this point we will abandon a model using the individual start data
and move our focus to the time to event data where all stocks are observed from the first day of trade in our study.
Table 4.9: Tests of variable significance, individual start
Variables Wald Chi-Square DF Pr> Chi-Square
Sector 12.2571 8 0.1401
Size 1.1709 2 0.5569
QR 07 1.3094 1 0.2525
D E 07 1.0549 1 0.3044
Table 4.10: Tests of global hypothesis, β = 0, individual start, Sector, Size and KPIs
Test Chi-Square DF Pr> Chi-Square
Likelihood Ratio 17.4770 12 0.1325
Score 16.4837 12 0.1701
Wald 15.8733 12 0.1971
4.2.2 Day One Study Start
As in the previous section we will start by estimating a Cox proportional hazards model using only the Sector variable. In Table: (4.11) we test the significance of the Sector variable and find that it is significant (P-value=0.0025), which means that it would be valuable to include the sector dummies in the model. The SAS output from this estimation is shown in Table: (4.12). The sector dummies are not significant, but if we were to interpret their estimated Hazard Ratios we would find that all sectors except for the Commodity and Health sectors have a higher hazard of experiencing the event than the base sector. In this case (and throughout) the reference sector is Energy. Since the Sector variable was significant in Table: (4.11) we can conclude that there is a simultaneous effect from the sectors, but it is too weak to give any significant sector dummies. In Section: (4.1.1) we suspected that the Material sector had among the worst survivals and, although not significant, the hazard ratio for Material in Table: (4.12) is
the second highest, which is in line with that suspicion. In Table: (4.13) we see that the model itself is significant (Likelihood Ratio P-value 0.0011), which is as expected since Sector was significant and it is the only variable included. The AIC value of this model is 1927, as presented in Table: (4.13).
Table 4.11: Tests of variable significance, ’Day One’ start
Variables Wald Chi-Square DF Pr> Chi-Square
Sector 23.7271 8 0.0025*
∗=significant on the 5% level
Table 4.12: Estimated Cox model using ’Day One’ start with Sector as covariate (repre-
sented by dummies, reference sector is Energy)
Variable DF Parameter Estimate Standard Error Chi-Square Pr> Chi-Square Hazard Ratio Lower 95% HR Confidence Limit Upper 95% HR Confidence Limit
Commodity 1 -0.45377 0.60568 0.5613 0.4537 0.635 0.183 2.110
Finance 1 0.08029 0.43313 0.0344 0.8529 1.084 0.502 2.828
Health 1 -0.74116 0.49404 2.2506 0.1336 0.477 0.188 1.357
IT 1 0.00124 0.43533 0.0000 0.9977 1.001 0.461 2.621
Industry 1 0.18257 0.42798 0.1820 0.6697 1.200 0.563 3.109
Material 1 0.59217 0.48345 1.5003 0.2206 1.808 0.734 5.073
Rare Commodity 1 0.55293 0.44401 1.5508 0.2130 1.738 0.783 4.609
Tele 1 0.82165 0.60858 1.8228 0.1770 2.274 0.652 7.592
Table 4.13: Tests of global hypothesis, β = 0, ’Day One’ start, Sector
Test Chi-Square DF Pr> Chi-Square
Likelihood Ratio 25.8037 8 0.0011*
Score 25.2700 8 0.0014*
Wald 23.7271 8 0.0025*
Akaike information criterion (AIC) 1927
∗=significant on the 5% level
When we studied the survival function of this data stratified by size in Figure: (4.8) and Table: (7.4) we found that this stratification was highly significant and that the one survival curve that stood out was the one for Small Cap companies. If we look at Table: (4.14) we can see that both Sector (P-value=0.0145) and Size (P-value=0.0026) are significant. This means that we can include the dummies in the model. When the size dummies are included in our Cox model in Table: (4.15) below, the trend continues. The dummy for Small Cap companies is highly significant with a P-value of 0.0016. The estimated Hazard Ratio for Small Cap is 0.570 (Large Cap is the reference) which means that the hazard of experiencing the event is almost halved for Small Cap companies in comparison to Large Cap companies. No other variable in the Cox model is significant, but as we can see in Table: (4.16) the model as a whole is highly significant with P-values of 0.0001 and less than 0.0001 for the different test statistics. The AIC value has gone down from 1927 to 1909 in this model (Table: (4.16)).
Table 4.14: Tests of variable significance, ’Day One’ start
Variables Wald Chi-Square DF Pr> Chi-Square
Sector 19.0697 8 0.0145*
Size 11.9238 2 0.0026*
∗=significant on the 5% level
Table 4.15: Estimated Cox model using 'Day One' start with Sector and Size as covariates (represented by dummies, reference categories are Energy and Large Cap)
Variable DF Parameter Estimate Standard Error Chi-Square Pr> Chi-Square Hazard Ratio Lower 95% HR Confidence Limit Upper 95% HR Confidence Limit
Commodity 1 -0.44101 0.60846 0.5253 0.4686 0.643 0.184 2.148
Finance 1 0.17605 0.44007 0.1600 0.6891 1.192 0.543 3.145
Health 1 -0.47106 0.50346 0.8754 0.3495 0.624 0.242 1.806
IT 1 0.35851 0.44969 0.6356 0.4253 1.431 0.637 3.830
Industry 1 0.33156 0.43179 0.5896 0.4426 1.393 0.648 3.629
Material 1 0.58308 0.48874 1.4233 0.2329 1.792 0.719 5.070
Rare Commodity 1 0.67495 0.44780 2.2718 0.1317 1.964 0.877 5.237
Tele 1 0.86611 0.62125 1.9436 0.1633 2.378 0.667 8.131
Mid Cap 1 -0.07074 0.17387 0.1655 0.6841 0.932 0.662 1.311
Small Cap 1 -0.56165 0.17807 9.9482 0.0016* 0.570 0.402 0.809
∗=significant on the 5% level
Table 4.16: Tests of global hypothesis, β = 0, ’Day One’ start, Sector and Size
Test Chi-Square DF Pr> Chi-Square
Likelihood Ratio 38.1300 10 < 0.0001*
Score 37.4240 10 < 0.0001*
Wald 35.5295 10 0.0001*
Akaike information criterion (AIC) 1909
∗=significant on the 5% level
Finally we will include the KPIs in the model in order to adjust for the differences in financial status between the companies. Table: (4.17) shows that when the KPIs are included, Sector and Size are still significant, which means that we again can move on with estimating a Cox model including the dummies for Sector and Size together with the KPIs. This model's output is presented in Table: (4.18). Neither of the added continuous variables is significant (the Quick Ratio P-value is 0.2821 and the Debt to Equity P-value is 0.9351). The Debt to Equity value in fact has an estimated Hazard Ratio of 1, which means that a one point increase or decrease has no impact on the survival in this model. Again the global test is highly significant, which we can see in Table: (4.19) where the Likelihood Ratio P-value is less than 0.0001. The AIC value of 1262 in Table: (4.19) is the best for any of the models using 'Day One' data and the available variables, which means that this model will be selected as the "best" model, even though the KPIs do not have a significant effect.3
3Since there were some missing values in the collection of these KPIs this model is estimated using 201 observations with 38 censored cases.
Table 4.17: Tests of variable significance, 'Day One' start
Variables Wald Chi-Square DF Pr> Chi-Square
Sector 19.6852 8 0.0116*
Size 14.9865 2 0.0006*
Quick Ratio 1.1569 1 0.2821
Debt to Equity 0.0066 1 0.9351
∗=significant on the 5% level
Table 4.18: Estimated Cox model using 'Day One' start with Sector, Size and KPIs as covariates (Sector and Size are represented by dummies, reference categories are Energy and Large Cap)
Variable DF Parameter Estimate Standard Error Chi-Square Pr> Chi-Square Hazard Ratio Lower 95% HR Confidence Limit Upper 95% HR Confidence Limit
Commodity 1 -0.19476 0.73496 0.0702 0.7910 0.823 0.184 3.632
Finance 1 0.44098 0.60601 0.5295 0.4668 1.554 0.505 5.797
Health 1 -0.47210 0.62681 0.5673 0.4513 0.624 0.191 2.391
IT 1 0.71883 0.54566 1.7354 0.1877 2.052 0.785 7.045
Industry 1 0.39899 0.52148 0.5854 0.4442 1.490 0.606 4.945
Material 1 0.48143 0.58875 0.6687 0.4135 1.618 0.550 5.892
Rare Commodity 1 1.02829 0.54297 3.5865 0.0583 2.796 1.075 9.559
Tele 1 0.89225 0.73077 1.4908 0.2221 2.441 0.554 10.770
Mid Cap 1 -0.07209 0.21328 0.1142 0.7354 0.930 0.613 1.417
Small Cap 1 -0.77934 0.22589 11.9035 0.0006* 0.459 0.294 0.715
Quick Ratio 1 -0.10458 0.09723 1.1569 0.2821 0.901 0.727 1.070
Debt to Equity 1 0.0000552 0.0006771 0.0066 0.9351 1.000 0.999 1.001
∗=significant on the 5% level
Table 4.19: Tests of global hypothesis, β = 0, 'Day One' start, Sector, Size and KPIs
Test Chi-Square DF Pr> Chi-Square
Likelihood Ratio 42.8679 12 < 0.0001*
Score 39.6337 12 < 0.0001*
Wald 37.0490 12 0.0002*
Akaike information criterion (AIC) 1262
∗=significant on the 5% level
Residuals and Assumptions
Now we know that our final model will use the 'Day One' data set, and we have seen that the best model has all available variables included. The sector and size variables are dummies and we cannot do much about them, but the two KPIs included in the model are on a continuous scale and might either be better suited as discrete variables or transformed in some form in order to best explain their effect on survival. As explained in Section: (2.1.7) the Martingale residuals can be used to find a good form for these covariates. We will first plot the Martingale residuals for the two KPIs with no transformation and evaluate their form. These plots are shown in Figures: (4.9) and (4.10) below.
Figure 4.9: Martingale residuals for the Quick Ratio
Figure 4.10: Martingale residuals for the Debt to Equity
Neither of these figures has the straight fitted line that we are looking for, but they also have no clear breaking points suggesting cut-offs that could be used to make the variables discrete (in comparison to the example in Figure: (2.5)). There are different ways to transform a variable, such as squaring it, in order to make the fitted line straighter and thus improve the variable's contribution to the Cox model. In our case, since we have some large observations for the ratios but mostly lower values, we will test transforming the variables using the logarithm. By taking the logarithm of these variables we decrease the influence of large values and increase the differences between the smaller values. This might improve our Martingale plots and make the fitted lines straighter, and we will test it next. The resulting Martingale plots are shown in Figures: (4.11) and (4.12); both variables have improved slightly. The fitted curves are still not perfectly straight, but they do look much smoother this time, and we will use these transformations of the two KPI variables.
Figure 4.11: Martingale residuals for log(Quick Ratio)
Figure 4.12: Martingale residuals for log(Debt to Equity)
Next we will check whether the assumption of proportional hazards is valid. There were two methods of checking the assumption in Section: (2.1.7). Since we will keep the Quick Ratio and Debt to Equity measurements continuous we will not use the Arjas plot to check the proportional hazards assumption for these variables, but we will use it on our categorical variables.

In the Arjas plot we want the lines for each group to be straight, with a slope different from 1, as explained in Section: (2.1.7). In Figure: (4.13) we can see that some of the lines are not straight. This is especially true for the Industry line, which starts off with a slope larger than 45◦ but curves downwards as the number of failures accumulates and in the end lies below the 45◦ line. The lines for Rare Commodities and Material have the same tendency to curve towards the end of each respective line.
Figure 4.13: Arjas plot for Sector
In Figure: (4.14) the Arjas plot for the Size groups is presented, and it looks much better than the one in Figure: (4.13). There is a slight upwards curve on the Mid Cap line, but in general all three groups look very good.
Figure 4.14: Arjas plot for Size
Next we will create an artificial time-dependent version of each variable we are testing and look at the estimated parameters for that variable to evaluate whether the assumption holds, as outlined in Section: (2.1.7). Since Sector and Size are two sets of dummy variables, all dummies in a set will have to be tested together, and in those cases it is the hypothesis that all estimated parameters for the time-dependent covariates are zero that we evaluate, just as when we test whether the variables should be included in the Cox model.
Table 4.20: Test of proportional hazards for each of the covariates
Tested Variable Chi-Square DF Pr> Chi-Square
Sector 12.914 7 0.074
Size 1.755 1 0.185
log(Quick Ratio) 0.043 1 0.836
log(Debt to Equity) 1.003 1 0.317
Table: (4.20) consists of the results from four estimated Cox models, each one estimated with only one covariate and its artificial time-dependent variable. In Table: (4.20) we can see that, although the sector variable set is close to rejection, none of the tests are rejected on the 5% level. This result matches the one we found from the Arjas plots. In Figure: (4.13) some of the groups were not perfectly straight; however, the proportional hazards assumption is not rejected for any of the included variables in our study and there is no need to adjust our model.
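A closely related check is available in lifelines through its test based on the scaled Schoenfeld residuals; the sketch below is an assumed substitute for the SAS time-interaction test described above (it tests the same null hypothesis of proportional hazards per covariate, but not by explicitly adding an artificial time-dependent variable). Here cph and X denote a fitted model and its training data, as in the earlier sketch.

```python
from lifelines.statistics import proportional_hazard_test

# One chi-square test per covariate against the null of proportional hazards.
ph_test = proportional_hazard_test(cph, X, time_transform="log")
print(ph_test.summary)

# check_assumptions bundles the same tests with diagnostic advice (and optional plots).
cph.check_assumptions(X, p_value_threshold=0.05, show_plots=False)
```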
4.2.3 Final Model
Having concluded that the proportional hazards assumption is not rejected for any of the covariates, and having decided on what form we want for the continuous KPI variables, we can now estimate the final Cox proportional hazards model. Checking Table: (4.21), it is clear that although we have changed the KPI variables slightly by taking their logarithm, the categorical variables are still significant (the P-value of Sector is 0.0153 and of Size is 0.0008). The estimated Cox model output is presented in Table: (4.22), where the only variable that is individually significant is still the Small Cap dummy with a P-value of 0.0011. The sector dummy with the lowest P-value (0.0773) is Rare Commodities. Although not significant, it appears that the Rare Commodities sector has the worst survival with a hazard ratio of 2.621, while the Health sector has the lowest hazard ratio (0.669) and seems to have the best survival. The transformation of the Debt to Equity ratio has not made a huge difference to the estimated Hazard Ratio. In Table: (4.23) we can see that all global tests reported are highly significant with P-values of 0.0001 or less. The AIC value of 1260 for this model is slightly better than for the full model with the untransformed KPI variables (Table: (4.23)).
Table 4.21: Tests of variable significance, Final Model
Variables Wald Chi-Square DF Pr> Chi-Square
Sector 18.9253 8 0.0153*
Size 14.2516 2 0.0008*
log(Quick Ratio) 2.3325 1 0.1267
log(Debt to Equity) 0.6711 1 0.4127
∗=significant on the 5% level
Table 4.22: Estimated Cox model using ’Day One’ start with the final set of covariates
Variable DF Parameter Estimate Standard Error Chi-Square Pr> Chi-Square Hazard Ratio Lower 95% HR Confidence Limit Upper 95% HR Confidence Limit
Commodity 1 -0.28487 0.68822 0.1713 0.6789 0.752 0.193 3.128
Finance 1 0.46301 0.60650 0.5828 0.4452 1.589 0.515 5.929
Health 1 -0.40215 0.62720 0.4111 0.5214 0.669 0.204 2.566
IT 1 0.80682 0.54844 2.1642 0.1413 2.241 0.852 7.726
Industry 1 0.40552 0.52124 0.6053 0.4366 1.500 0.610 4.976
Material 1 0.45956 0.58806 0.6107 0.4345 1.583 0.539 5.757
Rare Commodity 1 0.96359 0.54553 3.1200 0.0773 2.621 1.001 8.993
Tele 1 0.93390 0.73115 1.6315 0.2015 2.544 0.577 11.235
Mid Cap 1 -0.01327 0.21558 0.0038 0.9509 0.987 0.647 1.510
Small Cap 1 -0.73423 0.22518 10.6313 0.0011* 0.480 0.308 0.746
log(Quick Ratio) 1 -0.25932 0.16980 2.3325 0.1267 0.772 0.553 1.075
log(Debt to Equity) 1 0.06412 0.07826 0.6711 0.4127 1.066 0.917 1.247
∗=significant on the 5% level
Table 4.23: Tests of global hypothesis, β = 0, Final Model
Test Chi-Square DF Pr> Chi-Square
Likelihood Ratio 45.6193 12 < 0.0001*
Score 41.6097 12 < 0.0001*
Wald 39.0325 12 0.0001*
Akaike information criterion (AIC) 1260
∗=significant on the 5% level
Now we have ended up with a model that is highly significant, and we will end with a quick evaluation of how well this model fits the data. In Section: (2.1.7) we described how the Cox-Snell residuals can be used to evaluate the goodness of fit of an estimated model. When plotting the Cox-Snell residuals against the estimated cumulative hazard rates of the model we want them to line up on a 45◦ line. Remember that the goodness of fit "test" is strongest in the left end of the plot since it contains more observations (residuals). The Cox-Snell plot is shown in Figure: (4.15).

Figure 4.15: Cox-Snell residuals plotted against cumulative hazards of the final model

The model seems to fit the data rather well. Initially the residuals follow the 45◦ line quite closely, only to deviate from it in the top-right part where the number of residuals is smaller.
Summary and Conclusion
Throughout this thesis we have looked at the theory behind survival analysis and the Cox proportional hazards model. We applied this theory to Time to Event data from the Swedish stock market and the financial crisis of 2008. Our aim was to answer the question What is the contribution from a company's sector with regards to its survival of a financial crisis? with the sub question Can we use survival analysis on financial data to answer this?.
Beginning with the sub question, we have found that using the Cox proportional hazards model on financial data is viable. What you need to make sure is that the dependent variable is on the Time to Event form, meaning that it should be a measurement of time from a starting point to the time at which a specific event occurs. If your dependent variable is on this form and you aim to find covariates that affect the time for the event to occur, then this method should work well. In Table: (4.23) we could see that our final model was highly significant, and from Figure: (4.15) we can deduce that our estimated model seems to have a rather good fit.
In contrast to the result that Ni (2009) obtained, our KPI variables were insignificant; however, this does not mean that KPIs are irrelevant to this kind of study. Survival analysis could very well be used successfully to assess the effect of different financial ratios on the development of stock prices. Remember that the event does not need to be a loss of stock price.
Moving on to our main question, the results are mixed. In Table: (4.22) none of the dummies for the different sectors were significant. This gives no support for a sector-specific contribution to stocks' survival during 2008. However, in Table: (4.21) the simultaneous test of the Sector variable is significant. This indicates that there in fact is a significant difference between the sectors during this time period (the model is fitted to the event data following all stocks from the first day of trade). However, the power of the Sector variable is not strong enough to make the individual dummies significant, but still
there was an effect from the different sectors on the survival of a stock during 2008.
A variable included in the model to which we attached no theoretical background, except that it would help adjust for different types of companies, was the Size variable. In our final model we found that the Small Cap companies had a better survival time in comparison to the Large Cap companies. We have not looked at any theory regarding whether small or large companies survive financial crises best, so we will not say whether this result was expected or not, but it is interesting to note that Small Cap companies have a Hazard Ratio of 0.48 (Table: (4.22)) when compared to Large Cap companies.
I would recommend using the survival analysis method when working with financial data, and there are a lot of possible applications of this method. Possible future studies could move the focus from company sector to size in order to find out why the Small Cap variable is significant. Another possible application and further study is to do as Ni (2009), who focused on the financial ratios. This study could also be tweaked with regards to the time to event variables. We mentioned briefly in Section: (3.3) that the start of the study could be chosen to be the start of the crisis itself. Defining the actual start of the 2008 crisis and using it as the start of a time to event study was left out of this thesis but could very well be part of future research, which could yield different results.
Bibliography
O. Aalen. Nonparametric inference for a Family of Counting Processes. The Annals of
Statistics, 6(4):701–726, July 1978.
P. D. Allison. Survival Analysis Using SAS; A Practical Guide. SAS Institute Inc, Cary,
NC, second edition, 2010.
E. Arjas. A Graphical Method for Assessing Goodness of Fit in Cox's Proportional Hazards Model. Journal of the American Statistical Association, 83(401):204–212, 1988.
R. A. Brealey, S. C. Myers, and F. Allen. Principles of Corporate Finance. McGraw-Hill/Irwin, New York, New York, tenth edition, 2011.
D. R. Cox. Regression Models and Life-Tables. Journal of the Royal Statistical Society. Series B (Methodological), 34(2):187–220, 1972.
E. L. Kaplan and P. Meier. Nonparametric Estimation from Incomplete Observations.
Journal of the American Statistical Association, 53(282), 1958.
C. P. Kindleberger and R. Z. Aliber. A History of Financial Crises. John Wiley & Sons,
Hoboken, New Jersey, fifth edition, 2005.
J. P. Klein and M. L. Moeschberger. Survival Analysis; Techniques for Censored and Truncated Data. Springer, New York, New York, second edition, 2005.
D. G. Kleinbaum and M. Klein. Survival Analysis; A Self-Learning Text. Springer, New
York, New York, third edition, 2012.
E. T. Lee and J. W. Wang. Statistical Methods for Survival Data Analysis. John Wiley
& Sons, Hoboken, New Jersey, third edition, 2003.
A. Melville. International Financial Reporting. Pearson Education Limited, Essex, England, third edition, 2011.
W. Nelson. Theory and Applications of Hazard Plotting for Censored Failure Data.
Technometrics, 14(4):945–966, November 1972.
J. Ni. Application of Cox Proportional Hazard Model to the Stock Exchange Market.
B.S. Undergraduate Mathematics Exchange, 6(1), 2009.
S. H. Penman. Financial Statement Analysis and Security Valuation. McGraw-Hill/Irwin,
New York, New York, fourth edition, 2010.
I. Persson and H. Khamis. A Comparison of Graphical Methods for Assessing the Pro-
portional Hazards Assumption in the Cox Model. Journal of Statistics & Applications,
2(1-2):1–132, 2007.
I. Persson and H. Khamis. A Comparison of Statistical Tests for Assessing the Propor-
tional Hazards Assumption in the Cox Model. Journal of Statistics & Applications, 3
(1-2):135–154, 2008.
Appendix
7.1 Descriptive Statistics for Strata
Table 7.1: Number of observed stocks, events, and censored cases by sector, individual
start
Sector Total Events Censored
Commodity 8 6 2
Energy 8 6 2
Financials 58 51 7
Health Care 27 21 6
IT 59 52 7
Industry 73 67 6
Materials 17 15 2
Rare Commodity 38 35 3
T/CM 5 5 0
Total 293 258 35
Table 7.2: Number of observed stocks, events, and censored cases by sector, ’Day One’
start
Sector Total Events Censored
Commodity 8 5 3
Energy 8 6 2
Financials 58 48 10
Health Care 27 13 14
IT 59 44 15
Industry 73 61 12
Materials 17 15 2
Rare Commodity 38 33 5
T/CM 5 5 0
Total 293 230 63
Table 7.3: Number of observed stocks, events, and censored cases by size, individual start
Size Total Events Censored
Large Cap 88 78 10
Mid Cap 83 75 8
Small Cap 122 105 17
Total 293 258 35
Table 7.4: Number of observed stocks, events, and censored cases by size, ’Day One’ start
Size Total Events Censored
Large Cap 88 74 14
Mid Cap 83 70 13
Small Cap 122 86 36
Total 293 230 63
7.2 Cumulative Hazards
Figure 7.1: Estimated cumulative hazard stratified by sector, individual start
Figure 7.2: Estimated cumulative hazard stratified by sector, ’Day One’ start
Figure 7.3: Estimated cumulative hazard stratified by size, individual start
Figure 7.4: Estimated cumulative hazard stratified by size, ’Day One’ start
7.3 Companies
In the following table all companies/stocks included in this study are presented. The names are those used in 2014 and all financial data were collected from Datastream Professional by Thomson Reuters.
Table 7.5: Stocks included in the study
ASTRAZENECA (OME) ELECTROLUX ’A’ LUNDIN PETROLEUM SAGAX
A-COM ELECTROLUX ’B’ LUXONEN SDB SAGAX PREFERRED
AARHUSKARLSHAMN ELEKTA ’B’ MALMBERGS ELEKTRISKA ’B’ SANDVIK
ABB (OME) ELOS ’B’ MEDA ’A’ SAS
ACANDO ’B’ ENEA MEDIVIR ’B’ SCA ’A’
ACAP INVEST ’A’ ENIRO MEKONOMEN SCA ’B’
ACAP INVEST ’B’ ERICSSON ’A’ MELKER SCHORLING SCANIA ’A’
ACTIVE BIOTECH ERICSSON ’B’ METRO INTL.SDB ’A’ SCANIA ’B’
ADDNODE ’B’ FABEGE METRO INTL.SDB ’B’ SEB ’A’
ADDTECH ’B’ FAGERHULT MICRONIC MYDATA SEB ’C’
AEROCRINE ’B’ FAST PARTNER MIDSONA ’A’ SECO TOOLS ’B’
AF ’B’ FASTIGHETS BALDER ’B’ MIDSONA ’B’ SECTRA ’B’
AFFARSSTRATEGERNA ’B’ FAZER KONFEKTYR SERVICE MIDWAY HOLDINGS ’A’ SECURITAS ’B’
ALFA LAVAL FEELGOOD SVENSKA MIDWAY HOLDINGS ’B’ SEMCON
ALLENEX FENIX OUTDOOR ’B’ MILLICOM INTL.CELU.SDR SENSYS TRAFFIC
ALLIANCE OIL SDR (OTC) FINGERPRINT CARDS ’B’ MODERN TIMES GP.MTG ’A’ SIGMA B
ANOTO GROUP G & L BEIJER MODERN TIMES GP.MTG ’B’ SINTERCAST
ARTIMPLANT GETINGE MODUL 1 DATA SKANDITEK INDRI.FRV.
ASPIRO GEVEKO ’B’ MSC KONSULT ’B’ SKANSKA ’B’
ASSA ABLOY ’B’ GUNNEBO MULTIQ INTERNATIONAL SKF ’A’
ATLAS COPCO ’A’ GUNNEBO INDUSTRIER MUNTERS SKF ’B’
ATLAS COPCO ’B’ HALDEX NCC ’A’ SKISTAR ’B’
ATRIUM LJUNGBERG ’B’ HAVSFRUN INVESTMENT ’B’ NCC ’B’ SOFTRONIC ’B’
AUDIODEV ’B’ HEBA ’B’ NEDERMAN HOLDING SSAB ’A’
AUTOLIV SDB HEMTEX NEONET SSAB ’B’
AVANZA BANK HOLDING HENNES & MAURITZ ’B’ NET INSIGHT ’B’ STOCKWIK FORVALTNING
AXFOOD HEXAGON ’B’ NETONNET STORA ENSO ’A’
AXIS HIQ INTERNATIONAL NEW WAVE GROUP ’B’ STORA ENSO ’R’
B&B TOOLS ’B’ HL DISPLAY ’B’ NIBE INDUSTRIER ’B’ STUDSVIK
BALLINGSLOV INTL. HMS NETWORKS NILORNGRUPPEN ’B’ SVEDBERGS I DALSTORP ’B’
BE GROUP HOGANAS ’B’ NOBEL BIOCARE (OME) SVENSKA HANDBKN.’A’
BEIJER ALMA ’B’ HOLMEN ’A’ NOBIA SVENSKA HANDBKN.’B’
BEIJER ELECTRONICS HOLMEN ’B’ NOKIA (OME) SVITHOID TANKERS ’B’
BERGS TIMBER ’B’ HOME PROPERTIES NOLATO ’B’ SVOLDER ’A’
BETSSON ’B’ HQ NORDEA BANK SVOLDER ’B’
BILIA ’A’ HUFVUDSTADEN ’A’ NORDNET ’B’ SWECO ’A’
BILLERUD KORSNAS HUFVUDSTADEN ’C’ NOTE SWECO ’B’
BIOGAIA ’B’ HUMAN CARE H C NOVACAST TECHS.’B’ SWEDBANK ’A’
BIOINVENT INTL. HUSQVARNA ’A’ NOVESTRA SWEDISH MATCH
BIOLIN SCIENTIFIC HUSQVARNA ’B’ NOVOTEK ’B’ SWEDISH ORPHAN BIOVITRUM
BIOPHAUSIA ’A’ I A R SYSTEMS GROUP OEM INTERNATIONAL ’B’ SYSTEMAIR
BIOTAGE IBS ’B’ OPCON TANGANYIKA OIL SDB
BJORN BORG ICA GRUPPEN ORC GROUP TECHNOLOGY NEXUS
BOLIDEN IMAGE SYSTEMS ORESUND INVESTMENT TELE2 ’A’
BONG INDL.& FINL.SYS.’A’ OREXO TELE2 ’B’
BORAS WAFVERI ’B’ INDL.& FINL.SYS.’B’ ORIFLAME COSMETICS SDR TELECA ’B’
BOSS MEDIA INDUSTRIVARDEN ’A’ ORTIVUS ’A’ TELIASONERA
BRINOVA FASTIGHETER INDUSTRIVARDEN ’C’ ORTIVUS ’B’ TELIGENT
BRIO ’B’ INDUTRADE OXIGENE (OME) TICKET TRAVEL
BROSTROM INTELLECTA ’B’ PA RESOURCES ’B’ TIETO CORPORATION (OME)
BTS GROUP INTRUM JUSTITIA PARTNERTECH TILGIN
BURE EQUITY INVESTOR ’A’ PEAB ’B’ TRACTION ’B’
CARDO INVESTOR ’B’ PEAB INDUSTRI ’B’ TRADEDOUBLER
CARL LAMM JEEVES INFO.SYSTEMS PHONERA TRANSCOM WWD.SDB.A
CASHGUARD ’B’ JM POOLIA ’B’ TRANSCOM WWD.SDB.B
CASTELLUM KABE HUSVAGNAR ’B’ PRECISE BIOMETRICS TRELLEBORG ’B’
CATELLA ’A’ KAPPAHL PREVAS ’B’ TRICORONA
CATELLA ’B’ KARO BIO PRICER ’B’ UNIBET GROUP SDB
CATENA KAUPTHING BANK PROACT IT GROUP UNIFLEX ’B’
CISION KINNEVIK ’A’ PROBI VBG GROUP
CLAS OHLSON ’B’ KINNEVIK ’B’ PROFFICE ’B’ VENUE RETAIL GROUP ’B’
CONCORDIA MARITIME ’B’ KLOVERN PROFILGRUPPEN ’B’ VITROLIFE
CONNECTA KNOW IT Q-MED VOLVO ’A’
CONSILIUM ’B’ KUNGSLEDEN RATOS ’A’ VOLVO ’B’
CTT SYSTEMS LAGERCRANTZ GROUP ’B’ RATOS ’B’ VOSTOK GAS SDB
CYBERCOM GROUP EUROPE LAMMHULTS DESIGN GROUP RAYSEARCH LABS.’B’ VOSTOK NAFTA INV.SDR
DAGON LATOUR INVESTMENT ’A’ READSOFT ’B’ WALLENSTAM ’B’
DIN BOSTAD SVERIGE LATOUR INVESTMENT ’B’ REDERI AB TNSAT.’B’ WIHLBORGS FASTIGHETER
DIOS FASTIGHETER LAWSON SOFTWARE REJLERS B XANO INDUSTRI ’B’
DORO LBI INTERNATIONAL REZIDOR HOTEL GROUP XPONCARD
DUNI LEDSTIERNAN ’B’ RNB RETAIL AND BRANDS ZODIAK TELEVISION ’B’
DUROC ’B’ LINDAB INTERNATIONAL RORVIK TIMBER
EAST CAPITAL EXPLORER LUNDBERGFORETAGEN ’B’ ROTTNEROS
ELANDERS ’B’ LUNDIN MINING SDB SAAB ’B’
7.4 Example Data
In Table: (7.6) the example dataset that is used to create the examples is presented. This dataset was generated in order to give good examples and is in no way made up of real data. We can pretend that we are studying how long a mechanical component lasts before it breaks; thus the individuals are components. Time to Event and Event give the time a component has been studied before it breaks (Event=1 and time is in months). If a component has not broken before time 30 it is time censored. There are two different factories that produce the component, Factory1 and Factory2. We also know how much the component is used; Usage could be the amount of weight that is put on the component or the number of times it is used. Finally we have a Grade variable which could be the result from a tensile strength test on the batch from which the component comes (lower is better).
Table 7.6: Example Data (columns: Individual, Time to Event, Event, Factory, Usage, Grade)