Prediction of occupational accident statistics and work time loss distributions using Bayesian analysis

Eftychia C. Marcoulaki*, Ioannis A. Papazoglou, Myrto Konstandinidou

System Reliability and Industrial Safety Laboratory, National Centre for Scientific Research “Demokritos”, PO Box 60228, 15310 Aghia Paraskevi, Greece

Journal of Loss Prevention in the Process Industries, 25(3), pp. 467-477 (May 2012) doi:10.1016/j.jlp.2011.11.014

ABSTRACT

This paper uses Bayesian analysis tools for the stochastic evaluation of work time losses

due to occupational accidents in a workplace. Models are developed for accident

frequencies, duration of recovery from an accident, and the worker unavailability. The

unavailability statistics are hereby derived considering a two state stochastic model, to

provide estimates for the expected work time losses over a base period of workplace

operation. The above models are applied on real multiyear accident data collected from the

Greek Petrochemical Industry.

Keywords: Bayesian analysis, occupational accidents, work time loss, petrochemical

industry

* Corresponding author. Tel: +302106503743; fax: +302106545496.

E-mail addresses: [email protected] (Eftychia C. Marcoulaki),

[email protected] (Ioannis A. Papazoglou),

[email protected] (Myrto Konstandinidou)

Bayesian analysis for the estimation of work time loss distributions


1. INTRODUCTION

Health and safety at work is one of the most important and most advanced fields of the European

Union (EU) social policy (European Commission, 2011). Based on EU-15 statistics, there

are 4.3 million non-fatal occupational accidents resulting in more than three days absence

from work every year across the EU-15 (Eurostat, average 1998-2007), or around 146

million lost workdays. These accidents cost around 20 billion euro to the EU, and a

considerable share of the above costs falls upon social security systems and public finances

(Konkolewsky, 2001). Employers face costs linked to sick pay, replacement of absent

workers, training/administration and loss of productivity; only a part of these are covered by

insurance. For European workers, the total annual loss of income due to absence from work

is estimated at one billion euro (European Commission, 2011).

This paper presents a set of Bayesian tools, to model the work time lost on sick leave due

to occupational accidents occurring in an industrial workplace. A lot of research has been

done on the quantitative analysis of occupational accident data. This includes tools for

identifying causal factors related to different consequences of incidents (Bellamy et al.,

2007; Konstandinidou et al., 2011; etc.), modeling occupational accident frequency (Meel et

al., 2007; Marcoulaki et al., 2011; etc.), assessing occupational risk (Marhavilas et al., 2011),

etc. Data on work time losses are typically considered in analyses for accident severity

(Jacinto and Soares, 2008; Blanch et al., 2009; Carnero and Pedregal, 2010; etc).

Acknowledging the financial implications of occupational accidents, Parejo-Moscoso et al.

(2011) estimated costs related to sick-leave, and Jallon et al. (2011) provided data collection

criteria for the development of indirect accident cost calculation models. Ale et al. (2008)

constructed an exposure-based occupational risk model, to derive improvement measures

and support cost-effective risk reduction strategies.

Some of the above works rely on descriptive statistics and factor analysis, others make

use of advanced prediction methods. Bayesian inference methods have long been used for

the analysis of accident databases (Hora and Iman, 1990) to model uncertainties and enable

future predictions. None of the previous works considered sick leaves due to occupational

accidents, and the associated work time loss for the company, within a Bayesian perspective.

The tools presented herein include Bayesian models for the prediction of (a) the number

of accidents during a given time period, (b) the duration of the recovery from an accident,

and (c) the unavailability of labor force in the workplace. The developed models are updated



using available evidence from real accident data, to inform predictions of work time losses

and related probability density functions (pdf’s). The proposed framework can be useful in

investigating future trends in the workplace, and enabling a better management of the labor

force.

The paper is organized as follows. Section 2 presents the method of analysis and the

Bayesian models. Section 3 presents and discusses the numerical results using evidence from

real accident reports. Section 4 concludes this work.

2. BAYESIAN MODELS

This work applies Bayesian inference methods for the analysis of occupational accident

data, to predict:

(i) the number of accidents expected over a given time period in a company

workplace

(ii) the duration of the recovery period following an accident

(iii) the amount of time that the workers are recovering from the accidents, and not

being available to perform their work.

The Bayesian approach has the following steps:

(i) the quantity $x$ to be estimated is assumed to be a random variable, generated according to a specific stochastic model. Let $f(x \mid \theta)$ be the associated pdf and $\theta$ be a vector of parameters for $x$,

(ii) if $\theta$ is not known, we quantify our prior belief about the true value of $\theta$ as a pdf $g(\theta) \equiv g(\theta \mid \phi)$, where $\phi$ is a vector of parameters of $g$,

(iii) we collect evidence $E$ through observation of the stochastic process, e.g. regarding worker accidents in a workplace. We then quantify the relative likelihood of observing $E$ as $L(E \mid \theta)$ (likelihood function),

(iv) we calculate the posterior pdf according to the general formulation of Bayes’ theorem, $g(\theta \mid E) = g(\theta)\, L(E \mid \theta) \big/ \int L(E \mid \theta)\, g(\theta)\, d\theta$. Since the integral in the denominator does not depend on $\theta$, we get $g(\theta \mid E) \propto g(\theta)\, L(E \mid \theta)$.

The posterior pdf provides an expression of the remaining uncertainty about the value of $\theta$, and consequently about $f(x \mid \theta)$. The more information we have, the less the uncertainty about the true value of $\theta$.
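The four steps above can be sketched numerically on a grid of parameter values. The following is a minimal illustration only, not the authors' implementation: the function names `grid_posterior` and `grid_mean`, the Gamma(1, 1) prior and the evidence of 3 events in 10 time units are all hypothetical choices made for the example.

```python
import math

def grid_posterior(prior, likelihood, grid):
    """Posterior pdf on an evenly spaced grid: g(theta | E) ∝ g(theta) * L(E | theta)."""
    step = grid[1] - grid[0]
    unnormalised = [prior(theta) * likelihood(theta) for theta in grid]
    # Rectangle-rule estimate of the integral in the denominator of Bayes' theorem
    evidence = sum(unnormalised) * step
    return [u / evidence for u in unnormalised]

def grid_mean(grid, density):
    """Expected value of a density tabulated on an evenly spaced grid."""
    step = grid[1] - grid[0]
    return sum(theta * d for theta, d in zip(grid, density)) * step

# Hypothetical inputs: Gamma(1, 1) prior on a rate, evidence of 3 events in 10 time units
grid = [0.001 * i for i in range(1, 5001)]
prior = lambda theta: math.exp(-theta)
likelihood = lambda theta: theta ** 3 * math.exp(-10.0 * theta)

posterior = grid_posterior(prior, likelihood, grid)
posterior_mean = grid_mean(grid, posterior)
```

For this conjugate pair the exact posterior is a Gamma pdf with shape 4 and rate 11, so `posterior_mean` should come out close to 4/11; the grid approach is useful mainly when no such closed form is available.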



2.1. Problem description

Workers in chemical companies are often involved in occupational accidents. Some of

these accidents require several days leave before the worker recovers and can return to work.

Others (like a very minor injury) may be dealt with within the company premises with negligible

loss of work time. The company pays wages for the total workdays, including the time lost

for accident recovery. It is therefore important to have a reliable estimation of the amount of

work time loss due to the occupational accidents, in total as well as per accident. The

following models aim to support such estimations, starting from the model of section 2.2 to

predict the expected number of accidents over a given period of company operation. Sections

2.3 and 2.4 present models for work time loss predictions, either per individual accident

regardless of the total work time, or over a given period of company operation, respectively.

Figure 1 gives a simple illustration of a workplace with N workers. In general, the set of

workers may include employees or contractors and the employment time is not necessarily

the same for all workers. The figure shows the timeline for each worker during a period of

observation starting at time $T_S$ and terminating at $T_F$. Each worker, $n = 1, 2, \ldots, N$, works between $T_{S,n} \ge T_S$ and $T_{F,n} \le T_F$, and during this time the worker is involved in $K_n$ accidents. The starting times $T_{S,1}$, $T_{S,5}$, $T_{S,6}$, $T_{S,8}$ and $T_{S,9}$ coincide with $T_S$. The finishing times $T_{F,1}$, $T_{F,2}$, $T_{F,6}$, $T_{F,8}$ and $T_{F,9}$ coincide with $T_F$. Only workers 1, 6, 8 and 9 are employed during the entire period of observation, while the others work only for a fraction of this time. The working time of worker $n$ before accident $k_n = 1, 2, \ldots, K_n$ occurs is denoted by $t_{n,k_n}$. The respective recovery time is denoted by $r_{n,k_n}$. The time range between the recovery from the last accident, $K_n$, and $T_{F,n}$ is denoted by $s_n$. Worker 3 has no accidents, so $s_3 = T_{F,3} - T_{S,3}$. The contract of worker 4 expires when she/he recovers from her/his last accident, thus $s_4 = 0$.

2.2. Number of accident occurrences over a period of time

The occurrence of an occupational accident during a time period is not certain, thus the

number of accidents occurring during the same period in a workplace is a random variable.

The first set of models considers the probability distribution of the number of accidents

occurring over a given time interval, τ. Let the times between successive accidents be

randomly distributed according to an Exponential pdf, with rate of accident occurrence λ.

The stochastic process generating the accidents is then a Poisson process, and the number of

accidents κτ occurring over a given time interval τ is distributed according to the pdf:


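$$ f(\kappa_\tau \mid \lambda) = \frac{[\lambda(\tau - \rho_\tau)]^{\kappa_\tau} \exp[-\lambda(\tau - \rho_\tau)]}{\kappa_\tau!} \tag{1} $$

where $\rho_\tau$ denotes the amount of time within $\tau$ that workers are recovering from accidents (see section 2.4). In the present application, we expect that $\rho_\tau \ll \tau$, thus the models for accident occurrences assume $\tau - \rho_\tau \approx \tau$, and equation (1) becomes:

$$ f(\kappa_\tau \mid \lambda) = \frac{(\lambda\tau)^{\kappa_\tau} \exp(-\lambda\tau)}{\kappa_\tau!} \tag{2} $$

Equation (2) can be evaluated for given values of the rate parameter $\lambda$. If $\lambda$ is not known with certainty, the Bayesian approach consists in assuming that $\lambda$ is a stochastic variable distributed according to a known pdf $g(\lambda \mid \phi)$. It is assumed here that the $\lambda$ prior is a Gamma pdf, with shape parameter $a_\lambda$ and rate parameter $b_\lambda$, i.e. $g(\lambda) \equiv f_\Gamma(\lambda \mid a_\lambda, b_\lambda)$. Then, integrating $\lambda$ out of the pdf of $\kappa_\tau$ gives the following analytical solution:

$$ f(\kappa_\tau \mid a_\lambda, b_\lambda) = \frac{\Gamma(a_\lambda + \kappa_\tau)}{\Gamma(a_\lambda)\, \kappa_\tau!} \left(\frac{b_\lambda}{b_\lambda + \tau}\right)^{a_\lambda} \left(1 - \frac{b_\lambda}{b_\lambda + \tau}\right)^{\kappa_\tau} \tag{3} $$

If $g(\lambda \mid \phi)$ follows other pdfs, numerical integration can be applied. Note that equation (3) is a Negative Binomial pdf with success probability $b_\lambda / (b_\lambda + \tau)$ (Forbes et al., 2011).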
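As a hedged illustration of equation (3), the Negative Binomial probabilities can be evaluated in log space with the standard library. The function name `accidents_pmf` and the parameter values below are hypothetical and are not taken from Table 1.

```python
import math

def accidents_pmf(k, a, b, tau):
    """Negative Binomial probability of k accidents over a time base tau,
    for a rate that follows a Gamma pdf with shape a and rate b."""
    p = b / (b + tau)  # "success" probability of equation (3)
    log_pmf = (math.lgamma(a + k) - math.lgamma(a) - math.lgamma(k + 1)
               + a * math.log(p) + k * math.log1p(-p))
    return math.exp(log_pmf)

# Hypothetical parameter values, chosen only for illustration
a, b, tau = 5.0, 1.0e6, 250_000.0
pmf = [accidents_pmf(k, a, b, tau) for k in range(200)]
total = sum(pmf)                                  # should be close to 1
mean = sum(k * p for k, p in enumerate(pmf))      # should be close to a * tau / b
```

Working with `math.lgamma` avoids overflow of the Gamma function for large shape values, which matters once the shape parameter is updated with a large accident count.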

Figure 2 presents prior distributions for κτ using the pairs of $a'_\lambda$ and $b'_\lambda$ values reported on

Table 1, where $a'_\lambda$ and $b'_\lambda$ denote the prior values of $a_\lambda$ and $b_\lambda$, respectively. Pair L0 is the

Gamma(0.001, 0.001) pdf, used in practice as a prior for Poisson processes (Meel and

Seider, 2006; Lambert et al., 2005). Each of the other three pairs expresses different degrees

of prior belief for the ranges of possible values of κτ and their associated probabilities, and is

derived assuming that λ lies with probability 95% within a specified range (λ1, λ2). The time

base used here is τ=250,000 workdays, which represents the equivalent of 1000 employees

working 8 hours per day, 5 days per week, 50 weeks per year. The four priors of Figure 2

and Table 1 are used in the data analysis of section 3.

The prior knowledge encapsulated in $g(\lambda) \equiv f_\Gamma(\lambda \mid a'_\lambda, b'_\lambda)$ and $f(\kappa_\tau \mid a'_\lambda, b'_\lambda)$ can be updated using evidence collected from system observation. The evidence, $E_\lambda$, consists here of:

• the set of times between successive accidents, $t_{n,k_n}$, for the $N$ workers, and

• the set of times without accident, $s_n$, between the recovery from the last accident of worker $n$ and $T_{F,n}$.


The likelihood of the evidence is the probability of the joint event $\bigcap_{n=1}^{N} \bigcap_{k_n=1}^{K_n} t_{n,k_n} \cap \bigcap_{n=1}^{N} s_n$, so $L(E_\lambda \mid \lambda) = \prod_{n=1}^{N} \prod_{k_n=1}^{K_n} f(t_{n,k_n} \mid \lambda) \prod_{n=1}^{N} f(\kappa_{s_n} = 0 \mid \lambda)$. Assuming that all the company workers have identical behavior in terms of generating accidents, thus $f(t_{n,k_n} \mid \lambda) = \lambda \exp(-\lambda t_{n,k_n})$ and $f(\kappa_{s_n} = 0 \mid \lambda) = \exp(-\lambda s_n)$, the likelihood function is:

$$ L(E_\lambda \mid \lambda) = \lambda^{K} \exp[-\lambda (T - R)] \tag{4} $$

where $T$ is the total work time during the observation period, $T = \sum_{n=1}^{N} (T_{F,n} - T_{S,n})$, and $R$ is the total time loss due to occupational accidents, $R = \sum_{n=1}^{N} \sum_{k_n=1}^{K_n} r_{n,k_n}$. Since the prior for $\lambda$ is a Gamma pdf and the Gamma pdf is the conjugate prior to the Poisson pdf, the posterior distribution for $\lambda$ is also in the family of Gamma pdfs. In effect:

$$ g(\lambda \mid E_\lambda) \propto \lambda^{a'_\lambda + K - 1} \exp[-\lambda (b'_\lambda + T - R)], \quad a_\lambda = a'_\lambda + K \ \text{and} \ b_\lambda = b'_\lambda + T - R \tag{5} $$

where $K$ is the total number of accidents, $K = \sum_{n=1}^{N} K_n$.

So, when the accidents occur according to a Poisson process, the sufficient statistics for $\lambda$ are the total number of accidents, $K$, and the time difference $T - R$.

The Negative Binomial pdf of equation (3) is also valid for the updated values of the λ

parameters, to derive posterior statistics for κτ. Assuming Gamma and Negative Binomial

pdfs for λ and κτ, respectively, their expected values and variances are formulated as (Forbes

et al., 2011):

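$$ E(\lambda) = \frac{a_\lambda}{b_\lambda} \quad \text{and} \quad V(\lambda) = \frac{a_\lambda}{b_\lambda^{2}} \tag{6} $$

$$ E(\kappa_\tau) = \frac{a_\lambda \tau}{b_\lambda} \quad \text{and} \quad V(\kappa_\tau) = \frac{a_\lambda \tau (b_\lambda + \tau)}{b_\lambda^{2}} \tag{7} $$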

Using the model of equation (3) for posterior pdf calculations, and real data from reported

accidents in a workplace, section 3 provides posterior results on the predicted number of

occupational accidents. Section 3 also discusses how the choices of prior pdfs affect the

predictions.
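The conjugate update of equation (5) and the moments of equations (6) and (7) reduce to a few arithmetic operations. The sketch below assumes hypothetical evidence (40 accidents, 2,000,000 observed workdays, 600 lost workdays) and the Gamma(0.001, 0.001) prior; the function names are illustrative only.

```python
def update_rate_posterior(a_prior, b_prior, K, T, R):
    """Posterior Gamma parameters for the accident rate, following equation (5)."""
    return a_prior + K, b_prior + (T - R)

def rate_moments(a, b):
    """Expected value and variance of the accident rate, equation (6)."""
    return a / b, a / b ** 2

def accident_count_moments(a, b, tau):
    """Expected value and variance of the number of accidents over tau, equation (7)."""
    return a * tau / b, a * tau * (b + tau) / b ** 2

# Hypothetical evidence: K accidents, T observed workdays, R workdays lost to recovery
a_post, b_post = update_rate_posterior(0.001, 0.001, K=40, T=2.0e6, R=600.0)
mean_rate, var_rate = rate_moments(a_post, b_post)
mean_kappa, var_kappa = accident_count_moments(a_post, b_post, tau=250_000.0)
```

With these assumed inputs the expected number of accidents over 250,000 workdays is close to 5, and the predictive variance exceeds the mean by the factor $(b_\lambda + \tau)/b_\lambda$, which reflects the remaining uncertainty about the rate.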



2.3. Duration of the worker recovery from an accident

The second set of models gives the probability distribution of the time required to recover

from an accident. Let the time loss (in number of days), δ, be randomly distributed according

to an Exponential pdf, with rate of accident recovery μ:

$$ f(\delta \mid \mu) = \mu \exp(-\mu\delta) \tag{8} $$

Since the exact value of parameter μ is not known, we quantify our lack of knowledge by

assuming that μ is randomly distributed following a known prior pdf $g(\mu) \equiv g(\mu \mid \phi)$. When the μ prior is a Gamma pdf, i.e. $g(\mu) \equiv f_\Gamma(\mu \mid a_\mu, b_\mu)$, integrating μ out of the pdf of δ yields the following analytical solution:

$$ f(\delta \mid a_\mu, b_\mu) = a_\mu\, b_\mu^{a_\mu}\, (b_\mu + \delta)^{-(a_\mu + 1)} \tag{9} $$

Otherwise numerical integration can be applied.

Figure 3 presents prior distributions for δ using the pairs of $a'_\mu$ and $b'_\mu$ values reported on

Table 2, where $a'_\mu$ and $b'_\mu$ denote the prior values of $a_\mu$ and $b_\mu$, respectively. Pair M0 is the

Gamma(0.001, 0.001) pdf. The other three pairs express different degrees of prior belief for

the range of possible values of δ and their associated probabilities, and are derived assuming

that μ lies with given probability within a specified range (μ2, μ1). The priors of Figure 3 and

Table 2 are used in the data analysis of section 3.

The evidence Eμ, used here to derive posterior statistics for μ and δ, consists of the set of

recovery times, $r_{n,k_n}$, for the $K$ accidents, with $k_n = 1, 2, \ldots, K_n$ and $n = 1, 2, \ldots, N$. The likelihood of $E_\mu$ is the probability of observing the joint event $\bigcap_{n=1}^{N} \bigcap_{k_n=1}^{K_n} r_{n,k_n}$. Assuming that all the company workers and the accidents they are involved in have identical behavior in terms of generating accident recovery times, and that the recovery times are randomly distributed according to an Exponential pdf, the likelihood function is:

$$ f(E_\mu \mid \mu) = \prod_{k=1}^{K} \mu\, e^{-\mu r_k} = \mu^{K} e^{-\mu R} \tag{10} $$

With respect to the assumptions behind equation (10), the sufficient statistics for μ are the

total number of accidents, K, and their total recovery time R. Since Gamma is a conjugate

prior to the Exponential distribution $f(\delta \mid \mu)$, the posterior distribution for μ is also a Gamma distribution:

$$ g(\mu \mid E_\mu) \propto \mu^{a'_\mu + K - 1} \exp[-\mu (b'_\mu + R)], \quad a_\mu = a'_\mu + K \ \text{and} \ b_\mu = b'_\mu + R \tag{11} $$

The δ model in equation (9) is also valid for the updated values of the μ parameters. It can

be shown that, the expected value and variance formulae for the μ and δ pdfs are:

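$$ E(\mu) = \frac{a_\mu}{b_\mu} \quad \text{and} \quad V(\mu) = \frac{a_\mu}{b_\mu^{2}} \tag{12} $$

$$ E(\delta) = \frac{b_\mu}{a_\mu - 1} \quad \text{and} \quad V(\delta) = \frac{a_\mu b_\mu^{2}}{(a_\mu - 2)(a_\mu - 1)^{2}} \tag{13} $$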

Section 3 presents case study results for predicted accident durations using the prior pdfs

presented above.
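A minimal sketch of the recovery-time model follows, covering the update of equation (11), the predictive pdf of equation (9) and the moments of equation (13). The function names and the numerical inputs are hypothetical; note that the mean in equation (13) is finite only for $a_\mu > 1$ and the variance only for $a_\mu > 2$.

```python
import math

def update_recovery_posterior(a_prior, b_prior, K, R):
    """Posterior Gamma parameters for the recovery rate, following equation (11)."""
    return a_prior + K, b_prior + R

def recovery_pdf(delta, a, b):
    """Predictive pdf of the recovery time, equation (9), evaluated in log space."""
    return math.exp(math.log(a) + a * math.log(b) - (a + 1.0) * math.log(b + delta))

def recovery_moments(a, b):
    """Mean (requires a > 1) and variance (requires a > 2) from equation (13)."""
    mean = b / (a - 1.0)
    variance = a * b ** 2 / ((a - 1.0) ** 2 * (a - 2.0))
    return mean, variance

# Hypothetical evidence: 40 accidents with 600 lost workdays in total
a_post, b_post = update_recovery_posterior(0.001, 0.001, K=40, R=600.0)
mean_delta, var_delta = recovery_moments(a_post, b_post)
density_at_10_days = recovery_pdf(10.0, a_post, b_post)
```

The log-space form of `recovery_pdf` is a design choice to keep $b_\mu^{a_\mu}$ from overflowing when the accident count, and hence the shape parameter, is large.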

2.4. Worker unavailability and work time loss model

Figure 4 provides an illustration of the dynamic behavior of a company worker as a two-state

continuous-time Markov process (Howard, 1971). A worker at the first state is working

normally (available to work as required). Workers found at the second state are recovering

from an accident (unavailable due to an accident). A process randomly generates accidents

with rate λ, so that an available worker (state 1) becomes unavailable (state 2). Another

stochastic process generates accident recovery times, so that a worker at state 2 falls back to

state 1 with rate μ. The two rates λ and μ are assumed independent, and remain constant

when the transition distributions are Exponential.

Looking ahead, the probability of a worker occupying state 2 at a particular instance of

time, U(t), gives the probability that the worker will be unavailable for work at t. This

probability converges to a steady-state value, $U$, formulated as (Howard, 1971; Henley and Kumamoto, 1981):

$$ U = \lim_{t \to \infty} U(t) = \frac{\lambda}{\lambda + \mu} \tag{14} $$

Assuming that the rates λ and μ are statistically independent and each is distributed

according to a Gamma pdf, the joint pdf is the product of individual pdf’s. It can be shown

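(see Appendix A.1) that the pdf of the steady-state unavailability, $U$, is given by:

$$ h(U \mid a_\lambda, b_\lambda, a_\mu, b_\mu) = \frac{b_\lambda^{a_\lambda} b_\mu^{a_\mu}}{\mathrm{B}(a_\lambda, a_\mu)} \cdot \frac{U^{a_\lambda - 1} (1 - U)^{a_\mu - 1}}{[b_\lambda U + b_\mu (1 - U)]^{a_\lambda + a_\mu}} \tag{15} $$

The associated expected value and variance are formulated as (see Appendices A.2-3):

$$ E(U) = \frac{1 - \beta}{\mathrm{B}(a_\lambda, a_\mu)} \sum_{m=0}^{\infty} \beta^{m}\, \mathrm{B}(a_\lambda + m + 1, a_\mu) \tag{16} $$

$$ V(U) = \frac{(1 - \beta)^{2}}{\mathrm{B}(a_\lambda, a_\mu)} \sum_{m=0}^{\infty} (m + 1)\, \beta^{m}\, \mathrm{B}(a_\lambda + m + 2, a_\mu) - E(U)^{2} \tag{17} $$

where $\beta = 1 - b_\mu / b_\lambda$ and $\mathrm{B}(x, y)$ denotes the beta function for $x, y > 0$ (see notation).

As with the models for κτ and δ, equations (15)-(17) provide prior and posterior statistics of the worker unavailability, according to the quartet of $a_\lambda, b_\lambda, a_\mu, b_\mu$ values assumed in the calculations.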

The overall work time losses during a given time period, τ, of workplace operation are

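calculated as the product of the steady-state unavailability, $U$, and $\tau$. Using equation (14):

$$ \rho_\tau = U \tau, \quad E(\rho_\tau) = \tau\, E(U), \quad V(\rho_\tau) = \tau^{2}\, V(U) \tag{18} $$

Note that the steady-state assumption becomes valid for $\tau \gg (\lambda + \mu)^{-1}$.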

Figure 5 presents prior distributions for ρτ using the quartets of $a'_\lambda$, $b'_\lambda$, $a'_\mu$ and $b'_\mu$ values

reported on Table 3. As in section 2.2, the time base for ρτ calculations is τ=250,000

workdays. Case U0 assumes Gamma(0.001, 0.001) pdfs. In case U1, the quartets {aλ=aμ=1,

bλ=bμ} yield the uniform distribution for ρτ. As the bλ / bμ ratio increases, the pdf curve for

ρτ shifts towards lower unavailabilities, and case U2 uses bλ / bμ = $10^{3}$ to get

Pr / 1% 1% . Section 3 presents prediction results for work time losses in a

workplace, using equations (15)-(18) and the prior cases of Table 3 and Figure 5.
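The unavailability pdf of equation (15) and the series of equations (16) and (17) can be evaluated as sketched below. This is an illustration under stated assumptions rather than the authors' code: the series are truncated after a fixed number of terms (`terms` is a hypothetical tuning parameter), $\beta = 1 - b_\mu / b_\lambda$ is taken from equation (36), and the parameter values are invented for the example.

```python
import math

def log_beta(x, y):
    """Natural logarithm of the beta function B(x, y)."""
    return math.lgamma(x) + math.lgamma(y) - math.lgamma(x + y)

def unavailability_pdf(u, a_l, b_l, a_m, b_m):
    """Pdf of the steady-state unavailability, equation (15), for 0 < u < 1."""
    log_h = (a_l * math.log(b_l) + a_m * math.log(b_m)
             + (a_l - 1.0) * math.log(u) + (a_m - 1.0) * math.log(1.0 - u)
             - log_beta(a_l, a_m)
             - (a_l + a_m) * math.log(b_l * u + b_m * (1.0 - u)))
    return math.exp(log_h)

def unavailability_moments(a_l, b_l, a_m, b_m, terms=2000):
    """Expected value and variance of U from truncated series (16) and (17)."""
    beta = 1.0 - b_m / b_l
    lb0 = log_beta(a_l, a_m)
    s1 = s2 = 0.0
    for m in range(terms):
        w = beta ** m
        s1 += w * math.exp(log_beta(a_l + m + 1.0, a_m) - lb0)
        s2 += (m + 1) * w * math.exp(log_beta(a_l + m + 2.0, a_m) - lb0)
    mean = (1.0 - beta) * s1
    variance = (1.0 - beta) ** 2 * s2 - mean ** 2
    return mean, variance

# Hypothetical parameter values (beta = 0.75), chosen only for illustration
mean_U, var_U = unavailability_moments(2.0, 4.0, 3.0, 1.0)
tau = 250_000.0
mean_rho, var_rho = tau * mean_U, tau ** 2 * var_U   # equation (18)
```

When $\beta$ is close to 1 and the shape parameters are small, the series converge slowly and `terms` may need to be increased; comparing the result against a direct numerical integration of `unavailability_pdf` is a simple way to check the truncation.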

3. ANALYSIS OF OCCUPATIONAL ACCIDENT DATA

3.1. Accident data

The prior pdfs of the Bayesian models presented in section 2 are updated using available

evidence from real accident data. The derived posterior distributions embed knowledge on

the system, model its future behavior, and allow predictions for accident occurrences and

durations, and work time loss. For an observation period of known total work time, the

sufficient statistics for the models of section 2 are the number of accident occurrences and

their recovery times, as these are recorded in the company databases.

An extended database is used, which comprises data from reported accidents in the Greek

Petrochemical Industry over a multi-year period (Nivolianitou et al., 2006; Konstandinidou

et al., 2006). The collected data were acquired directly from the different sites, and the



participating companies gave access to their archives and to the initial reports of the

accidents. The sites included refineries, onshore and offshore facilities, storage locations and

extraction sites. The present analysis is based on 406 reported non-fatal accidents, which

occurred at 5 different sites between 1997 and 2003. For reasons of confidentiality,

the sites are hereby randomly referred to as A, B, C, D and E. Table 4 aggregates the

available accident data, and reports only the sufficient statistics for the models of section 2.

3.2. Calculations and results

The following calculations consider the cases chosen for the parameters of the Gamma

priors for λ and μ, as presented in section 2 (see also Figures 2,3,5 and Tables 1-3). The prior

choices affect the predictions, as the quartets $a_\lambda, a_\mu, b_\lambda, b_\mu$ are involved in the calculations

of κτ (equation (3)), δ (equation (9)) and ρτ (equations (15) and (18)).

Figures 6–8 present the posterior results for the five companies and their total, for each

prior case of section 2. Predictions on the number of accident occurrences (Figure 6) and

the work time losses (Figure 8) assume $\tau = 2.5 \times 10^{5}$ workdays. With the exception of

company B, the differences between the company posterior curves derived using different

priors are negligible. Company B is more sensitive to the prior parameter values, since it has

the shortest record of accidents and the fewest observed workdays. The prediction

differences for this company are particularly evident in the κτ model, whereas the models for

δ and ρτ appear less sensitive to the prior beliefs considered here. Regarding the validity of the steady-state assumption behind equation (14), $U(t)$ is within 1% of $U$ when $t \ge 4.6/(\lambda + \mu)$ (Howard, 1971; Henley and Kumamoto, 1981). For the sites A, B, C, D and E, this range is reached after 38, 108, 121, 12 and 76 workdays, respectively.
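The factor 4.6 in the criterion above is approximately ln(100). The sketch below illustrates this using the standard transient solution of a two-state Markov process for a worker who is available at $t = 0$; that transient expression is not written out in the text and is assumed here, and the two rates are hypothetical values rather than site estimates.

```python
import math

def transient_unavailability(t, lam, mu):
    """U(t) for a worker available at t = 0 (standard two-state Markov result)."""
    return lam / (lam + mu) * (1.0 - math.exp(-(lam + mu) * t))

def settling_time(lam, mu, tolerance=0.01):
    """Time after which U(t) is within a relative `tolerance` of its steady state."""
    return -math.log(tolerance) / (lam + mu)

# Hypothetical rates per workday, chosen only for illustration
lam, mu = 2.0e-5, 0.1
t99 = settling_time(lam, mu)                      # about 4.6 / (lam + mu)
steady_state = lam / (lam + mu)
ratio = transient_unavailability(t99, lam, mu) / steady_state
```

With these assumed rates the steady state is reached to within 1% after roughly 46 workdays, which is negligible compared with a time base of 250,000 workdays.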

The results reported at Table 5 include the posterior pdf statistics for the predicted

number of accidents κτ, the predicted accident duration δ, and the predicted work time losses

ρτ. Expected values and variances are estimated using the models of Section 2.

Even though the five sites host similar processes and their workers are involved in similar

activities, the analysis yields very different models for the prediction of the number of

accidents, their recovery times and the system unavailabilities. Looking at the expected

values of κτ, δ and ρτ reported at Table 5, the number of expected accidents at site C is almost

an order of magnitude higher than the other four sites. The expected recovery times are also

increased, though site B gives slightly higher recovery times (close to a month). The κτ

expected values for sites A and E are quite close. Expected value results for site D are



located between the predictions for the other sites. Looking at the variances, those of the δ predictions

appear relatively higher compared to ρτ and especially κτ.

Figure 9 shows the 50% and 95% probability intervals, and the median for each site and

their total, using the κτ model for the L0 prior case. Figure 10 provides similar results using

the δ model, assuming the M0 prior case. The derived intervals verify the results of Table 5,

also illustrating the increased uncertainties in the results for accident duration, compared to

the number of accidents during the base time used here. Figure 11 shows the same

probability intervals for the work time losses in the 5 sites and their total, using the models

of section 2.4 for prior case U0. The work time losses in sites C and B (Figure 11 top) are

predicted an order of magnitude higher than the losses in other sites and the total (Figure 11

bottom). The results for site B seem to have the highest uncertainty, which can be attributed

to the scarcity of the relevant data.

3.3. Discussion

The following analysis reveals that the number of accident records, K, is the most

important statistic to control the uncertainty in the predicted estimates. Uncertainty is hereby

quantified using the posterior coefficients of variation, formulated as:

$$ CV(\cdot) = \sqrt{V(\cdot)} \big/ E(\cdot) \tag{19} $$

The variances, and the resulting coefficients of variation, are high for κτ and even higher

for δ, while the uncertainty in the work time loss predictions is notably lower. Consider the

κτ model in Section 2.2, and sufficient evidence to ensure that the prior values $a'_\lambda, b'_\lambda$ have negligible impact on the final predictions. This assumption is valid for sites A, C, D and E, which have sufficiently large data pools. Then, using equation (7), the posterior coefficient of variation for κτ takes the form:

$$ CV(\kappa_\tau) = \sqrt{\frac{T - R + \tau}{K \tau}} \tag{20} $$

Equation (20) indicates that, for given evidence, the value of $CV(\kappa_\tau)$ decreases as the considered time base τ increases, and reaches its minimal value, $K^{-0.5}$, at infinite τ. As the observation time ($T$) increases and since $T \gg R$, the uncertainties in the prediction of accident occurrences become smaller in “riskier” workplaces, i.e. sites with higher $K/T$ ratios. In effect, site C with $K/T \approx 10^{-4}$ has the lowest $CV(\kappa_\tau)$, while sites A and E with $K/T \approx 1.5 \times 10^{-5}$ exhibit the highest $CV(\kappa_\tau)$.
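The behavior described by equation (20) can be checked with a short calculation. The evidence values below are hypothetical, and `cv_accident_count` is an illustrative name.

```python
import math

def cv_accident_count(K, T, R, tau):
    """Posterior coefficient of variation of the accident count, equation (20)."""
    return math.sqrt((T - R + tau) / (K * tau))

# Hypothetical evidence: 100 accidents, 1,000,000 workdays observed, 2,000 lost
K, T, R = 100, 1.0e6, 2000.0
time_bases = (2.5e5, 2.5e6, 2.5e7)
cvs = [cv_accident_count(K, T, R, tau) for tau in time_bases]
lower_limit = 1.0 / math.sqrt(K)   # value approached as tau grows without bound
```

For these assumed inputs the coefficient of variation falls as the time base grows and approaches `lower_limit`, consistent with the $K^{-0.5}$ limit stated above.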

For the analysis of prediction uncertainties in the accident durations, δ, we use the model


of section 2.3. Assuming again that the prior $a'_\mu$ and $b'_\mu$ values are negligible, the posterior coefficient of variation for δ becomes:

$$ CV(\delta) = \sqrt{\frac{K}{K - 2}} \tag{21} $$

Therefore, $CV(\delta)$ depends solely on the number of reported accidents, and tends to

unity at increasing K. Equation (21) explains the particularly high variances calculated using

the δ model, which is derived under the assumption that accident recovery times follow

Exponential pdfs (equation (8)). Consideration of pdfs other than the Exponential may

hinder the development of analytical formulae for δ, thus call for Monte Carlo simulations.
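A Monte Carlo treatment of the δ model could look like the sketch below. It keeps the Exponential recovery-time pdf so that the simulated result can be compared with a closed form (the median $b_\mu (2^{1/a_\mu} - 1)$, which follows from integrating equation (9)); for other recovery-time pdfs the inner sampling call would be replaced. The posterior parameters, the sample size and the seed are hypothetical.

```python
import random

def sample_recovery_times(a, b, n, rng):
    """Draw n recovery times: rate mu from a Gamma pdf (shape a, rate b),
    then delta from an Exponential pdf with that rate."""
    samples = []
    for _ in range(n):
        mu = rng.gammavariate(a, 1.0 / b)   # gammavariate expects (shape, scale)
        samples.append(rng.expovariate(mu))
    return samples

rng = random.Random(0)                      # fixed seed for reproducibility
a, b = 50.0, 1000.0                         # hypothetical posterior parameters
draws = sorted(sample_recovery_times(a, b, 100_000, rng))
mc_median = draws[len(draws) // 2]
exact_median = b * (2.0 ** (1.0 / a) - 1.0)
```

Comparing `mc_median` with `exact_median` gives a quick sanity check of the sampler before the Exponential assumption is relaxed.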

The models for worker unavailability and work time losses are derived through the λ and

μ pdfs, thus the models for U and ρτ (section 2.4) do not inherit the high variances observed

in the κτ and δ models. Using equations (16)-(17) and assuming negligible prior parameter values, the posterior coefficients of variation for $U$ and ρτ are both formulated as:

$$ CV(\rho_\tau) = CV(U) = \sqrt{\frac{\mathrm{B}(K, K) \sum_{m=0}^{\infty} (m + 1)\, \beta^{m}\, \mathrm{B}(K + m + 2, K)}{\left[\sum_{m=0}^{\infty} \beta^{m}\, \mathrm{B}(K + m + 1, K)\right]^{2}} - 1} \tag{22} $$

For negligible $b'_\lambda$ and $b'_\mu$, equation (36) yields $\beta \approx 1 - R/T$, thus the uncertainties in the ρτ and $U$ predictions depend solely on $K$ and the $R/T$ ratio. Figure 12 presents a set of $CV(\rho_\tau)$ results from numerical experiments using different values of $K$ and $R/T$. The figure also shows the limiting curve for $\lim_{R/T \to 0} \beta = 1$ and the points for the five sites. The limiting

curve gives the maximal variation at each value of K, and is bounded by 2K

, for K > 2.

Again, the number of recorded accidents is proven to be the most important variable to

control the prediction uncertainty for work time loss, among the sufficient statistics of the

present analysis. Note that, these trends are valid for the likelihood and prior pdfs assumed

during the model development in section 2, whereas different assumptions may yield

different models and reveal other trends.
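The kind of numerical experiment behind Figure 12 can be sketched by evaluating equation (22) with truncated series. This is an assumption-laden illustration, not the authors' procedure: `terms` is a hypothetical truncation length, and the values of $K$ and $R/T$ are invented.

```python
import math

def log_beta(x, y):
    """Natural logarithm of the beta function B(x, y)."""
    return math.lgamma(x) + math.lgamma(y) - math.lgamma(x + y)

def cv_unavailability(K, beta, terms=5000):
    """Coefficient of variation of U and of the work time loss, equation (22),
    with both series truncated after `terms` terms."""
    lb0 = log_beta(K, K)
    s1 = s2 = 0.0
    for m in range(terms):
        w = beta ** m
        s1 += w * math.exp(log_beta(K + m + 1.0, K) - lb0)
        s2 += (m + 1) * w * math.exp(log_beta(K + m + 2.0, K) - lb0)
    return math.sqrt(s2 / s1 ** 2 - 1.0)

# Hypothetical experiment: fixed R/T = 0.01 (beta about 0.99), increasing K
ratio_R_T = 0.01
results = {K: cv_unavailability(K, 1.0 - ratio_R_T) for K in (5, 20, 80)}
```

Under these assumptions the coefficient of variation drops as $K$ increases at a fixed $R/T$ ratio, which is the trend the discussion attributes to the number of recorded accidents.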

4. CONCLUSIONS

This paper presents a Bayesian approach to the statistical analysis of occupational

accident data, including accident recovery times. Recovery times are considered per accident

and over a period of operation and relevant models are derived. A dynamic two state



stochastic model is used to derive the worker unavailability statistics, and predict the amount

of time that workers will be recovering from accidents and therefore won’t be available to

perform the job they are paid for. The models developed here can be informed using

available databases of occupational accidents observed in the process industry. Sufficient

statistics to use the models include only the total number of workdays, the work days lost

due to recovering from occupational accidents, and the number of occupational accidents

over the period of observation.

The work discusses various prior pdfs, which are updated using evidence from a database

of over 400 accidents reported in the Greek Petrochemical industry over a multi-year period.

The statistical analysis provides future predictions for the number of accidents, accident

durations and work time loss in each one of the considered sites. The present analysis shows

that the most important statistic to reduce the prediction uncertainties is the number of

observed accidents. Uncertainties in the expected number of accidents are inversely affected

by the total number of observed workdays. In sites where there is scarcity of accident data and the

observed workdays are few, the predictions are poor and sensitive to the choice of prior pdf

parameter values. The results also indicate that, the evidence collected at one site, or group

of sites, cannot be used directly to inform predictions at another similar site. Current work

considers the development of Bayesian models to enable integration of evidence collected at

different sites.

APPENDIX: STEADY STATE WORKER UNAVAILABILITY PDF

A.1. Worker unavailability pdf

Assuming that λ and μ are statistically independent and each is distributed according to

Gamma pdfs, the joint pdf is the product of individual pdf’s:

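$$ f(\lambda, \mu \mid a_\lambda, b_\lambda, a_\mu, b_\mu) = f_\Gamma(\lambda \mid a_\lambda, b_\lambda)\, f_\Gamma(\mu \mid a_\mu, b_\mu) \tag{23} $$

$$ f(\lambda, \mu) = \frac{b_\lambda^{a_\lambda} \lambda^{a_\lambda - 1} \exp(-b_\lambda \lambda)}{\Gamma(a_\lambda)} \cdot \frac{b_\mu^{a_\mu} \mu^{a_\mu - 1} \exp(-b_\mu \mu)}{\Gamma(a_\mu)} \tag{24} $$

Consider the transformation:

$$ (s, v) = g(\lambda, \mu) = \left( \lambda + \mu,\ \frac{\lambda}{\lambda + \mu} \right) \tag{25} $$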

Note that, v is the steady state unavailability (see equation (14)).

Let h(s,v) denote the pdf of the random variables s and v, then:


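$$ h(s, v) = f\left(g^{-1}(s, v)\right) \left| \det J(s, v) \right| \tag{26} $$

According to equation (25), $\lambda = vs$ and $\mu = (1 - v)s$, and the Jacobian determinant term in (26) becomes:

$$ J(s, v) = \begin{pmatrix} \partial\lambda/\partial s & \partial\lambda/\partial v \\ \partial\mu/\partial s & \partial\mu/\partial v \end{pmatrix} = \begin{pmatrix} v & s \\ 1 - v & -s \end{pmatrix} \ \Rightarrow \ \left| \det J(s, v) \right| = \left| -vs - s + vs \right| = s \tag{27} $$

Using equations (25) to (27), equation (24) is transformed to:

$$ h(s, v) = \frac{b_\lambda^{a_\lambda} (vs)^{a_\lambda - 1} \exp(-b_\lambda v s)}{\Gamma(a_\lambda)} \cdot \frac{b_\mu^{a_\mu} [(1 - v)s]^{a_\mu - 1} \exp[-b_\mu (1 - v) s]}{\Gamma(a_\mu)} \cdot s = \frac{b_\lambda^{a_\lambda} b_\mu^{a_\mu}}{\Gamma(a_\lambda) \Gamma(a_\mu)}\, s^{a_\lambda + a_\mu - 1}\, v^{a_\lambda - 1} (1 - v)^{a_\mu - 1} \exp\{-[b_\lambda v + b_\mu (1 - v)]\, s\} \tag{28} $$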

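The pdf of the unavailability variable $v \equiv U$ is derived by integrating out the variable $s$:

$$ h(v) = \frac{b_\lambda^{a_\lambda} b_\mu^{a_\mu}}{\Gamma(a_\lambda) \Gamma(a_\mu)}\, v^{a_\lambda - 1} (1 - v)^{a_\mu - 1} \int_0^{\infty} s^{a_\lambda + a_\mu - 1} \exp\{-[b_\lambda v + b_\mu (1 - v)]\, s\}\, ds \tag{29} $$

Using the Gamma function definition (Forbes et al., 2011), the integration in (29) gives:

$$ h(v) = \frac{b_\lambda^{a_\lambda} b_\mu^{a_\mu}\, \Gamma(a_\lambda + a_\mu)}{\Gamma(a_\lambda) \Gamma(a_\mu)} \cdot \frac{v^{a_\lambda - 1} (1 - v)^{a_\mu - 1}}{[b_\lambda v + b_\mu (1 - v)]^{a_\lambda + a_\mu}} \tag{30} $$

Since $\mathrm{B}(a_\lambda, a_\mu) = \Gamma(a_\lambda) \Gamma(a_\mu) / \Gamma(a_\lambda + a_\mu)$ (Forbes et al., 2011), equation (30) finally becomes:

$$ h(v; a_\lambda, b_\lambda, a_\mu, b_\mu) = \frac{b_\lambda^{a_\lambda} b_\mu^{a_\mu}}{\mathrm{B}(a_\lambda, a_\mu)} \cdot \frac{v^{a_\lambda - 1} (1 - v)^{a_\mu - 1}}{[b_\lambda v + b_\mu (1 - v)]^{a_\lambda + a_\mu}} \tag{31} $$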

A.2. Expected value of the worker unavailability pdf

The expected value of U according to equation (15) is given as follows:

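$$ E(U) = \int_0^1 v\, h(v)\, dv = \frac{1}{\mathrm{B}(a_\lambda, a_\mu)} \int_0^1 \left[\frac{b_\lambda v}{b_\lambda v + b_\mu (1 - v)}\right]^{a_\lambda} \left[\frac{b_\mu (1 - v)}{b_\lambda v + b_\mu (1 - v)}\right]^{a_\mu} \frac{dv}{1 - v} \tag{32} $$

$E(U)$ is derived with the aid of variable $z$, defined as:

$$ z = \frac{b_\lambda v}{b_\lambda v + b_\mu (1 - v)} \tag{33} $$

and equation (15) is transformed to give $h(U)$ as a function of $z$: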


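$$ h(U) = \frac{1}{\mathrm{B}(a_\lambda, a_\mu)} \cdot \frac{z^{a_\lambda} (1 - z)^{a_\mu}}{v (1 - v)} \tag{34} $$

Based on the definition of variable $z$ in equation (33):

$$ v = \frac{(1 - \beta) z}{1 - \beta z}, \quad 1 - v = \frac{1 - z}{1 - \beta z} \quad \text{and} \quad dv = \frac{1 - \beta}{(1 - \beta z)^{2}}\, dz \tag{35} $$

where $\beta$ is defined as:

$$ \beta = 1 - \frac{b_\mu}{b_\lambda} \tag{36} $$

Using equations (35)-(36), the integral (32) becomes:

$$ E(U) = \frac{1 - \beta}{\mathrm{B}(a_\lambda, a_\mu)} \int_0^1 z^{a_\lambda} (1 - z)^{a_\mu - 1} (1 - \beta z)^{-1}\, dz \tag{37} $$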

Note that, when $v$ is distributed according to equation (15), the variable $z$ is distributed according to a Beta pdf. Therefore, function $h(U)$ is a generalized-inverted-Beta pdf.

For the calculation of $E(U)$, the integral of equation (37) is hereby represented as a series. Provided that $\beta^{2} < 1$, the term $(1 - \beta z)^{-1}$ can be replaced by the Maclaurin series:

$$ (1 - \beta z)^{-1} = \sum_{m=0}^{\infty} \beta^{m} z^{m} \tag{38} $$

The assumption that $\beta^{2} < 1$ holds for $b_\lambda > b_\mu$. Since $b_\mu$ is the rate parameter of the Gamma pdf for the “repair” rate while $b_\lambda$ is the rate parameter of the Gamma pdf for the “failure” rate, it is safe to assume that $b_\lambda > b_\mu$. If for any reason $b_\lambda < b_\mu$, then the analysis can be performed using $\beta = 1 - b_\lambda / b_\mu$.

With the aid of equation (38), the integral in equation (37) becomes:

$$ E(U) = \frac{1 - \beta}{\mathrm{B}(a_\lambda, a_\mu)} \sum_{m=0}^{\infty} \beta^{m} \int_0^1 z^{a_\lambda + m} (1 - z)^{a_\mu - 1}\, dz \tag{39} $$

According to the definition of the Beta function, the integral terms in equation (39) give:

$$ E(U) = \frac{1 - \beta}{\mathrm{B}(a_\lambda, a_\mu)} \sum_{m=0}^{\infty} \beta^{m}\, \mathrm{B}(a_\lambda + m + 1, a_\mu) \tag{40} $$

A.3 Variance of the worker unavailability pdf

Similarly, the unavailability variance $V(U)$ is formulated as:

$$ V(U) = \int_0^1 [v - E(U)]^{2}\, h(v)\, dv = \int_0^1 v^{2} h(v)\, dv - E(U)^{2} \tag{41} $$


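With the aid of (33)-(35), the integral term in equation (41) yields:

$$ \int_0^1 v^{2} h(v)\, dv = \frac{(1 - \beta)^{2}}{\mathrm{B}(a_\lambda, a_\mu)} \int_0^1 z^{a_\lambda + 1} (1 - z)^{a_\mu - 1} (1 - \beta z)^{-2}\, dz \tag{42} $$

Assuming $\beta^{2} < 1$, the term $(1 - \beta z)^{-2}$ can be replaced by the Maclaurin series:

$$ (1 - \beta z)^{-2} = \sum_{m=0}^{\infty} (m + 1)\, \beta^{m} z^{m} \tag{43} $$

Using equation (43) the integral (42) becomes:

$$ \int_0^1 v^{2} h(v)\, dv = \frac{(1 - \beta)^{2}}{\mathrm{B}(a_\lambda, a_\mu)} \sum_{m=0}^{\infty} (m + 1)\, \beta^{m} \int_0^1 z^{a_\lambda + m + 1} (1 - z)^{a_\mu - 1}\, dz \tag{44} $$

According to the definition of the Beta function, (44) gives:

$$ \int_0^1 v^{2} h(v)\, dv = \frac{(1 - \beta)^{2}}{\mathrm{B}(a_\lambda, a_\mu)} \sum_{m=0}^{\infty} (m + 1)\, \beta^{m}\, \mathrm{B}(a_\lambda + m + 2, a_\mu) \tag{45} $$

By substituting equation (45) into (41) the variance $V(U)$ is finally calculated as:

$$ V(U) = \frac{(1 - \beta)^{2}}{\mathrm{B}(a_\lambda, a_\mu)} \sum_{m=0}^{\infty} (m + 1)\, \beta^{m}\, \mathrm{B}(a_\lambda + m + 2, a_\mu) - E(U)^{2} \tag{46} $$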

REFERENCES

Ale BJM, Baksteen H, Bellamy LJ, Bloemhof A, Goossens L, Hale A, Mud ML, Oh JIH,

Papazoglou IA, Post J, Whiston JY. Quantifying occupational risk: The development of an

occupational risk model. Safety Science 2008; 46(2): 176-185

Bellamy LJ, Ale BJM, Geyer TAW, Goossens LHJ, Hale AR, Oh J, Mud M, Bloemhof A,

Papazoglou IA, Whiston JY. Storybuilder – A tool for the analysis of accident reports.

Reliability Engineering and System Safety 2007; 92(6): 735-744

Blanch A, Torrelles B, Salinas JA. Age and lost working days as a result of an occupational

accident: A study in a shiftwork rotation system. Safety Science 2009; 47(10): 1359-1363

Carnero MC, Pedregal DJ. Modelling and forecasting occupational accidents of different

severity levels in Spain. Reliability Engineering and System Safety 2010; 95(11): 1134-1141

European Commission. Health and safety at work: EU Strategy 2007-2012. Last retrieved:

June 20 2011, from: http://ec.europa.eu/social/main.jsp?catId=151&langId=en

Eurostat. Accidents at work (ESAW) - until 2007 (hsw_acc7_work). Last update: 04-05-2011.

Last retrieved: 4 October 2011, from:

http://epp.eurostat.ec.europa.eu/portal/page/portal/statistics/search_database

Forbes C, Evans M, Hastings N, Peacock B. Statistical Distributions. 4th ed. Hoboken, New

Jersey: John Wiley & Sons, Inc.; 2011.

Henley EJ, Kumamoto H. Reliability Engineering and Risk Assessment. New York:

Prentice-Hall; 1981

Hora SC, Iman RL. Bayesian modeling of initiating event frequencies at nuclear power

plants. Risk Analysis 1990; 10(1): 103-109

Howard RA. Dynamic Probabilistic Systems. New York: John Wiley & Sons, Inc.; 1971

Jacinto C, Soares CG. The added value of the new ESAW/Eurostat variables in accident

analysis in the mining and quarrying industry. Journal of Safety Research 2008; 39(6): 631-644.

Jallon R, Imbeau D, De Marcellis-Warin N. Development of an indirect-cost calculation

model suitable for workplace use. Journal of Safety Research 2011; 42(3): 149-164

Konkolewsky H-H. Preventing accidents at work. Magazine of the European Agency for

Safety and Health at Work 2001; 4: 1. Last retrieved: June 20 2011, from:

http://osha.europa.eu/en/publications/magazine/4

Konstandinidou M, Nivolianitou Z, Kefalogianni E, Caroni C. In-depth analysis of the

causal factors of incidents reported in the Greek petrochemical industry. Reliability

Engineering and System Safety 2011; 96: 1448–1455

Konstandinidou M, Nivolianitou Z, Markatos N, Kiranoudis C. Statistical analysis of

incidents reported in the Greek Petrochemical Industry for the period 1997–2003. Journal of

Hazardous Materials 2006; A135: 1–9. [doi:10.1016/j.jhazmat.2005.10.059]

Lambert PC, Sutton AJ, Burton PR, Abrams KR, Jones DR. How vague is vague? A

simulation study of the impact of the use of vague prior distributions in MCMC using

WinBUGS. Statistics in Medicine 2005; 24 (15): 2401–2428. [doi:10.1002/sim.2112].

Marcoulaki EC, Konstandinidou M, Papazoglou IA. Dynamic failure assessment of incidents

reported in the Greek Petrochemical Industry. Computer-Aided Chemical Engineering 2011;

29: 1055-1059. [doi:10.1016/B978-0-444-53711-9.50211-X]

Marhavilas PK, Koulouriotis D, Gemeni V. Risk analysis and assessment methodologies in

the work sites: On a review, classification and comparative study of the scientific literature

of the period 2000-2009. Journal of Loss Prevention in the Process Industries 2011; 24(5):

477-523

Meel A, O’Neill LM, Levin JH, Seider WD, Oktem U, Keren N. Operational risk assessment

of chemical industries by exploiting incident databases. Journal of Loss Prevention in the

Process Industries 2007; 20: 113–127. [doi:10.1016/j.jlp.2006.10.003]

Meel A, Seider WD. Plant-specific dynamic failure assessment using Bayesian theory.

Chemical Engineering Science 2006; 61: 7036 – 7056. [doi:10.1016/j.ces.2006.07.007]

Nivolianitou Z, Konstandinidou M, Kiranoudis C, Markatos N. Development of a database

for accidents and incidents in the Greek petrochemical industry. Journal of Loss Prevention

in the Process Industries 2006; 19(6): 630-638. [doi:10.1016/j.jlp.2006.03.004]

Parejo-Moscoso JM, Rubio-Romero JC, Pérez-Canto S. Occupational accident rate in olive

oil mills. Safety Science 2011; In Press doi:10.1016/j.ssci.2011.08.064

NOTATION

$a_x$ = gamma pdf shape parameter for stochastic variable x

$b_x$ = gamma pdf rate parameter for stochastic variable x

$B(x, y) = \int_0^1 z^{x-1} (1-z)^{y-1}\, dz$ for $x, y > 0$ = Beta function (Forbes et al., 2011)

$CV[x] = \sqrt{V[x]} / E[x]$ = coefficient of variation for stochastic variable x

E = evidence

$E_\lambda = \{t_{n,k_n}, s_n\}$ = evidence for λ

$E_\mu = \{r_{n,k_n}\}$ = evidence for μ

$E[x]$ = expected value of stochastic variable x

$f(x \mid \theta)$ = conditional pdf of stochastic variable vector x given parameter vector θ

$f(x \mid a_x, b_x) = \dfrac{b_x^{a_x}\, x^{a_x-1}\, e^{-b_x x}}{\Gamma(a_x)}$ for $x, a_x, b_x > 0$ = Gamma pdf (Forbes et al., 2011)

$g'(x)$ = prior pdf of variable vector x

$g''(x \mid E)$ = posterior pdf of variable x given evidence E (can be simplified to $g''(x)$)

$K = \sum_{n=1}^{N} K_n$ = total number of accidents of the N workers over time T

$K_n$ = number of accidents occurring to worker n between $T_{S,n}$ and $T_{F,n}$



$k_n$ = index for the accidents of worker n, with $k_n \in [1, K_n]$

$L(E \mid \theta)$ = likelihood of evidence E given parameter vector θ

N = number of workers (employees and contractors) working for the company

n = index for company worker, with $n \in [1, N]$

$R = \sum_{n=1}^{N} \sum_{k_n=1}^{K_n} r_{n,k_n}$ = total time loss due to all the occupational accidents, in workdays

$r_{n,k_n}$ = duration of recovery of worker n from accident $k_n$, in workdays

$s_n$ = time interval of worker n between the time of recovery from accident $K_n$ and $T_{F,n}$, in workdays

$T = \sum_{n=1}^{N} (T_{F,n} - T_{S,n})$ = total work time for all the N workers, in workdays

$T_F$ = finishing time for the period of observation, in workdays

$T_{F,n}$ = time that worker n stops working at the company, in workdays

$T_n = T_{F,n} - T_{S,n} = \sum_{k_n=1}^{K_n} (t_{n,k_n} + r_{n,k_n}) + s_n$ = total work time of worker n, in workdays

$t_{n,k_n}$ = time interval between either $T_{S,n}$ (if $k_n = 1$), or the end of recovery from accident $k_n - 1$ (if $k_n > 1$), and the occurrence of accident $k_n$ of worker n, in workdays

$T_S$ = starting time for the period of observation, in workdays

$T_{S,n}$ = time that worker n starts working at the company, in workdays

U = average value of the worker unavailability over a base time period τ

$U_\infty$ = steady state value of worker unavailability reached at infinite time

U(t) = probability that the worker will be unavailable for work at time instance t

$V[x]$ = variance of stochastic variable x

$x'$ = prior instance of entity x

$x''$ = posterior instance of entity x

$\Gamma(x) = \int_0^\infty t^{x-1} e^{-t}\, dt$ for $x > 0$ (Gamma function, Forbes et al., 2011)

δ = predicted duration of recovery from an accident, in workdays

, = vectors of pdf parameters

κτ = predicted number of accidents over a base time period τ

λ = rate of accident occurrence parameter, in accidents·workdays⁻¹

μ = rate of accident recovery parameter, in workdays⁻¹



ρτ = overall work time losses during a given time period, τ, in workdays

τ = base time period for the prediction of number of accidents, worker unavailability and

work time loss, in workdays

TABLE CAPTIONS

Table 1: Prior pdf parameters for the κτ model (section 2.2)

Table 2: Prior pdf parameters for the δ model (section 2.3)

Table 3: Prior pdf parameters for the ρτ model (section 2.4)

Table 4: Occupational accident data

Table 5: Posterior pdf statistics for κτ, δ and ρτ

FIGURE CAPTIONS

Figure 1: Timelines for a workplace with N workers

Figure 2: Prior pdfs for the number of accidents during 250,000 workdays

Figure 3: Prior pdfs for the duration of accident recovery δ

Figure 4: Stochastic model for worker unavailability

Figure 5: Prior pdfs for the work time losses during 250,000 workdays

Figure 6: Posterior pdfs for the number of accidents during 250,000 workdays

Figure 7: Posterior pdfs for the duration of accident recovery δ

Figure 8: Posterior pdfs for the work time losses during 250,000 workdays

Figure 9: Probability intervals for the predicted number of accidents during 250,000

workdays

Figure 10: Probability intervals for the predicted duration of accident recovery

Figure 11: Probability intervals for the predicted work time losses during 250,000

workdays

Figure 12: Uncertainty in the work time loss predictions

Table 1: Prior pdf parameters for the κτ model (section 2.2)

| | L0 | L1 | L2 | L3 |
|---|---|---|---|---|
| aλ | 10⁻³ | 3.358 | 1.117 | 3.358 |
| bλ (days) | 10⁻³ | 7787 | 393 | 3894 |
| λ1⋅τ (accidents) | – | 10⁴ | 10⁴ | 5×10³ |
| Pr (λ ≥ λ1) | – | 2.50% | 2.50% | 2.50% |
| λ2⋅τ (accidents) | – | 10³ | 10² | 5×10² |
| Pr (λ ≤ λ2) | – | 2.50% | 2.50% | 2.50% |

Table 2: Prior pdf parameters for the δ model (section 2.3)

| | M0 | M1 | M2 | M3 |
|---|---|---|---|---|
| aµ | 10⁻³ | 1.509 | 0.5405 | 2.212 |
| bµ (days) | 10⁻³ | 3.577 | 0.7880 | 8.872 |
| µ1⁻¹ (days) | – | 20 | 250 | 20 |
| Pr (µ ≤ µ1) | – | 5% | 5% | 5% |
| µ2⁻¹ (days) | – | 3 | 3 | 3 |
| Pr (µ ≥ µ2) | – | 5% | 5% | 75% |

Table 3: Prior pdf parameters for the ρτ model (section 2.4)

| | U0 | U1 | U2 |
|---|---|---|---|
| aλ | 10⁻³ | 1 | 1 |
| bλ | 10⁻³ | 1 | 1 |
| aµ | 10⁻³ | 1 | 1 |
| bµ | 10⁻³ | 1 | 10⁻⁴ |

Table 4: Occupational accident data

| | Site A | Site B | Site C | Site D | Site E |
|---|---|---|---|---|---|
| number of accidents | 110 | 5 | 130 | 66 | 95 |
| number of workdays lost | 909 | 117 | 3424 | 165 | 1559 |
| total workdays | 7,527,422 | 161,079 | 1,238,019 | 2,271,740 | 6,452,076 |


Table 5: Posterior pdf statistics for κτ, δ and ρτ

Columns L0 to L3 refer to κτ, number of accidents*; columns M0 to M3 refer to δ, accident duration (workdays); columns U0 to U2 refer to ρτ, work time loss (workdays)*.

| prior case | L0 | L1 | L2 | L3 | M0 | M1 | M2 | M3 | U0 | U1 | U2 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Predicted average | E[κτ″], eq. (9) | | | | E[δ″], eq. (16) | | | | E[ρτ″], eq. (21), (23) | | |
| company A | 3.654 | 3.761 | 3.691 | 3.763 | 8.339 | 8.258 | 8.305 | 8.253 | 30.47 | 30.50 | 30.46 |
| company B | 7.767 | 12.38 | 9.477 | 12.68 | 29.24 | 21.89 | 25.94 | 20.26 | 226.8 | 219.6 | 217.8 |
| company C | 26.32 | 26.84 | 26.54 | 26.92 | 26.54 | 26.26 | 26.44 | 26.16 | 696.7 | 696.9 | 696.7 |
| company D | 7.264 | 7.607 | 7.385 | 7.620 | 2.538 | 2.535 | 2.530 | 2.587 | 18.44 | 18.54 | 18.43 |
| company E | 3.682 | 3.807 | 3.725 | 3.810 | 16.60 | 16.37 | 16.51 | 16.31 | 61.09 | 61.12 | 61.08 |
| Total | 5.753 | 5.798 | 5.768 | 5.799 | 15.25 | 15.20 | 15.23 | 15.19 | 87.68 | 87.69 | 87.68 |
| Predicted variance | V[κτ″], eq. (9) | | | | V[δ″], eq. (16) | | | | V[U∞″], eq. (22), (23) | | |
| company A | 3.775 | 3.886 | 3.813 | 3.888 | 70.83 | 69.44 | 70.25 | 69.35 | 17.11 | 16.98 | 16.95 |
| company B | 19.83 | 30.73 | 24.16 | 31.90 | 1425 | 691.4 | 1053 | 568.1 | 3.068×10³ | 2.202×10³ | 2.165×10³ |
| company C | 31.66 | 32.24 | 31.92 | 32.35 | 715.5 | 700.4 | 709.8 | 695.0 | 7513 | 7458 | 7454 |
| company D | 8.063 | 8.442 | 8.198 | 8.457 | 6.645 | 6.620 | 6.597 | 6.894 | 10.54 | 10.50 | 10.37 |
| company E | 3.825 | 3.955 | 3.869 | 3.957 | 281.3 | 273.7 | 278.4 | 271.5 | 79.79 | 79.03 | 78.93 |
| Total | 5.834 | 5.880 | 5.850 | 5.881 | 233.6 | 232.2 | 233.1 | 231.7 | 37.98 | 37.90 | 37.89 |

* τ = 250,000 workdays
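The arithmetic linking the accident data of Table 4 to the vague-prior columns (L0, M0, U0) of Table 5 can be sketched in Python. The sketch is illustrative only and rests on assumptions that are not restated in this part of the paper: conjugate gamma updates (shape plus number of accidents; rate plus time at work, T − R, for λ and plus recovery time, R, for μ), the standard moments of gamma-mixed Poisson and exponential models for κτ and δ, and an expected work time loss equal to τ times the series of equation (40) with β = 1 − bμ/bλ. All function names are hypothetical.

```python
import math

TAU = 250_000  # base period in workdays (footnote of Table 5)

# Table 4 data: site -> (accidents K, workdays lost R, total workdays T)
SITES = {
    "A": (110, 909, 7_527_422),
    "B": (5, 117, 161_079),
    "C": (130, 3424, 1_238_019),
    "D": (66, 165, 2_271_740),
    "E": (95, 1559, 6_452_076),
}


def posterior_params(K, R, T, prior=1e-3):
    """Assumed conjugate updates, all prior parameters equal to 1e-3."""
    a_lam, b_lam = prior + K, prior + (T - R)  # accident rate: time at work
    a_mu, b_mu = prior + K, prior + R          # recovery rate: time in recovery
    return a_lam, b_lam, a_mu, b_mu


def accident_count_stats(a_lam, b_lam, tau=TAU):
    """Mean and variance of a gamma-mixed Poisson count over tau."""
    mean = a_lam * tau / b_lam
    return mean, mean * (1.0 + tau / b_lam)


def recovery_duration_stats(a_mu, b_mu):
    """Mean and variance of a gamma-mixed exponential duration (a_mu > 2)."""
    mean = b_mu / (a_mu - 1.0)
    return mean, mean * mean * a_mu / (a_mu - 2.0)


def expected_work_time_loss(a_lam, b_lam, a_mu, b_mu, tau=TAU,
                            tol=1e-12, max_terms=1_000_000):
    """tau times the expected unavailability, summed as in equation (40)."""
    beta = 1.0 - b_mu / b_lam
    if not 0.0 <= beta < 1.0:
        raise ValueError("sketch assumes 0 <= beta < 1, i.e. b_mu <= b_lam")
    ratio = a_lam / (a_lam + a_mu)  # B(a_lam + m + 1, a_mu) / B(a_lam, a_mu)
    power = 1.0                     # beta**m
    total = 0.0
    for m in range(max_terms):
        term = power * ratio
        total += term
        if term <= tol * (1.0 - beta) * total:
            break
        power *= beta
        ratio *= (a_lam + m + 1.0) / (a_lam + a_mu + m + 1.0)
    return tau * (1.0 - beta) * total


if __name__ == "__main__":
    for site, (K, R, T) in SITES.items():
        a_lam, b_lam, a_mu, b_mu = posterior_params(K, R, T)
        print(site,
              accident_count_stats(a_lam, b_lam),
              recovery_duration_stats(a_mu, b_mu),
              expected_work_time_loss(a_lam, b_lam, a_mu, b_mu))
```

Under these assumptions the sketch gives values close to the L0, M0 and U0 entries of Table 5; differences in the last printed digit can arise from rounding in the tabulated inputs.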

Figure 1: Timelines for a workplace with N workers

[Diagram: timelines of workers n = 1 to 9 between TS and TF. Legend: start (TS,n), accident/incident, back to work, finish (TF,n), work time (tn,k; sn), recovery time (rn,k).]

Figure 2: Prior pdfs for the number of accidents during 250,000 workdays

[Plot: probability density versus number of accidents per 250,000 workdays (0 to 1000); curves for priors L0, L1, L2 and L3.]

Figure 3: Prior pdfs for the duration of accident recovery δ

[Plot: probability density (logarithmic scale) versus accident duration in workdays (0 to 100); curves for priors M0, M1, M2 and M3.]

Figure 4: Stochastic model for worker unavailability

[Diagram: two-state model with random process 1 and random process 2, linked by failure rate λ and repair rate μ.]

Figure 5: Prior pdfs for the work time losses during 250,000 workdays

[Plot: probability density (logarithmic scale) versus work time loss in workdays (0 to 5000); curves for priors U0, U1 and U2.]

Figure 6: Posterior pdfs for the number of accidents during 250,000 workdays

[Plots: probability density versus number of accidents per 250,000 workdays; one panel each for sites A, B, C, D, E and the total; curves for priors L0, L1, L2 and L3.]

Figure 7: Posterior pdfs for the duration of accident recovery δ

[Plots: probability density versus accident duration in workdays; one panel each for companies A, B, C, D, E and the total; curves for priors M0, M1, M2 and M3.]

Figure 8: Posterior pdfs for the work time losses during 250,000 workdays

[Plots: probability density versus work time loss in workdays; one panel each for companies A, B, C, D, E and the total; curves for priors U0, U1 and U2.]

Figure 9: Probability intervals for the predicted number of accidents during 250,000 workdays

[Chart: 50% range, 95% range and median of the number of accidents during 250,000 workdays (0 to 30), for sites A, B, C, D, E and the total.]

Figure 10: Probability intervals for the predicted accident duration

[Chart: 50% range, 95% range and median of the accident duration in workdays (0 to 140), one row per site plus the total.]

Figure 11: Probability intervals for the predicted work time losses during 250,000 workdays

[Charts: 50% range, 95% range and median of the work time loss (workdays) during 250,000 workdays, for sites A, B, C, D, E and the total; one panel spans 0 to 900 workdays and a second panel spans 10 to 100 workdays.]

Figure 12: Uncertainty in the work time loss predictions

[Plot: posterior CV for work time loss (0 to 1) versus number of reported accidents K (0 to 140); curves for β = 0.990 and β = 0.900 and for the maximum CV reached in the limit R/T → 0; site data points for A, B, C, D and E.]
