
AN ANALYSIS OF INJURY SEVERITIES OF LARGE-TRUCK CRASHES

By

XIAOYU ZHU

A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT

OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

UNIVERSITY OF FLORIDA

2011


© 2011 Xiaoyu Zhu


To my parents


ACKNOWLEDGMENTS

I would like to take this opportunity to thank my parents. They are always doing their best to provide me with opportunities to pursue my goals. Their endless support and encouragement lead me through every step in my life.

I would also like to thank the faculty members at the University of Florida (UF), who have provided me with a great deal of knowledge and skills during my Ph.D. study. I would like to thank my advisor, Dr. Siva Srinivasan, for constantly being a source of inspiration. He provided not only the foundation of this research but also approaches to successful research, which will be a lifelong benefit. I would like to thank Dr. Lily Elefteriadou for being an outstanding example of a woman professor, Dr. Scott Washburn for his positive attitude towards life and work, and Dr. Yafeng Yin for his constant enthusiasm and innovation in research. I would also like to thank Dr. Chunrong Ai and Dr. Trevor Park for their helpful comments from various perspectives, which placed this research in a broader context.

Special thanks go to Neng Fan for his support and companionship in both my life and my studies during these four years. I would also like to thank all my friends and colleagues for making my life in Gainesville enjoyable.


TABLE OF CONTENTS

ACKNOWLEDGMENTS

LIST OF TABLES

LIST OF FIGURES

ABSTRACT

CHAPTER

1 INTRODUCTION
    1.1 Background: Traffic Safety and Large Trucks
    1.2 Objectives of the Research
    1.3 Organization of the Document

2 LITERATURE REVIEW
    2.1 Research on Large-Truck Crashes
    2.2 Research on Modeling Injury-Severity of Automobile Crashes
        2.2.1 Injury of Interest
        2.2.2 Levels of Injury Severity
        2.2.3 Data Sources
        2.2.4 Modeling Method
            2.2.4.1 Treatment of ordinal
            2.2.4.2 Incorporating interdependencies among the injuries of all persons involved in the crash
        2.2.5 Explanatory Factors
            2.2.5.1 Crash characteristics
            2.2.5.2 Vehicle characteristics
            2.2.5.3 Driver and occupant characteristics
            2.2.5.4 Environmental characteristics
            2.2.5.5 Roadway characteristics
            2.2.5.6 Occupant protection
    2.3 Contribution of this Dissertation

3 DATA
    3.1 Data Source and Raw LTCCS Data Characteristics
    3.2 Sample Formation Procedure
        3.2.1 Selecting Cases
        3.2.2 Cleaning and Consistency Checking
        3.2.3 Variable Selection
            3.2.3.1 Crosstab check
            3.2.3.2 Classification analysis
            3.2.3.3 Missing data
    3.3 Sample Characteristics

4 MODELS FOR CRASH LEVEL INJURY
    4.1 Sample Data
    4.2 Methodology
    4.3 Empirical Results
        4.3.1 Crash-level Variables
        4.3.2 Truck-level Variables
        4.3.3 Car-level Variables
    4.4 Contributions

5 THE PANEL HETEROSKEDASTIC ORDERED PROBIT MODEL FOR OCCUPANT-LEVEL INJURY SEVERITY STUDY
    5.1 An Exploratory Analysis of Occupant-level and Crash-level Injury Severities
    5.2 Sample Data
    5.3 Methodology
    5.4 Empirical Results
        5.4.1 Truck Occupants
        5.4.2 Car Drivers
        5.4.3 Car Passengers
    5.5 Contributions

6 THE PANEL HETEROSKEDASTIC ORDERED GENERALIZED EXTREME VALUE MODEL IN INJURY SEVERITY STUDY
    6.1 Background
    6.2 Methodology
    6.3 Empirical Result
        6.3.1 Truck Occupants
        6.3.2 Car Drivers
        6.3.3 Car Passengers
        6.3.4 Application and Sensitivity Testing
    6.4 Contributions

7 SUMMARY AND CONCLUSIONS
    7.1 Contributions of the Dissertation
        7.1.1 Methodological Contributions
        7.1.2 Empirical Contributions
    7.2 Directions for Further Research

APPENDIX: DESCRIPTIONS FOR THE VARIABLES

LIST OF REFERENCES

BIOGRAPHICAL SKETCH


LIST OF TABLES

Table

3-1 Sample size for each category.
4-1 Cross tabulation of police-determined and researcher-determined injury severity levels.
4-2 Empirical model results: effects of crash-level variables.
4-3 Empirical model results: effects of truck-level variables.
4-4 Empirical model results: effects of car-level variables.
5-1 List of possible combinations of occupant's injury
5-2 Factors affecting the injury severity of truck occupants.
5-3 Factors affecting the injury severity of car drivers.
5-4 Factors affecting the injury severity of car passengers.
6-1 Model comparison.
6-2 Standard deviation of intra-vehicle correlation term.
6-3 Factors affecting the injury severity of truck occupants.
6-4 Factors affecting the injury severity of car drivers.
6-5 Factors affecting the injury severity of car passengers.
6-6 List of sensitivity test.
6-7 Sensitivity test: effect of airbags on the injury severity of truck drivers in truck-car head on crashes.
6-8 Sensitivity test: effect of truck driver behavior on the injury severity of truck drivers in truck-car head on crashes.
6-9 Sensitivity test: effect of seatbelts on the injury severity of truck drivers in truck-car head on crashes.
6-10 Sensitivity test: effect of crash type on the injury severity of truck drivers.
6-11 Sensitivity test: effect of airbag deployment on the injury severity of car drivers in truck-car head on crashes.
6-12 Sensitivity test: effect of airbag availability on the injury severity of car drivers in truck-car head on crashes.
6-13 Sensitivity test: effect of alcohol on the injury severity of car drivers in truck-car head on crashes.
6-14 Sensitivity test: effect of crash type on the injury severity of car drivers.
6-15 Sensitivity test: effect of airbag deployment on the injury severity of car passengers in truck-car head on crashes.
6-16 Sensitivity test: effect of airbag availability on the injury severity of car passengers in truck-car head on crashes.
6-17 Sensitivity test: effect of car driver distraction on the injury severity of car passengers in truck-car head on crashes.
6-18 Sensitivity test: effect of truck driver DUI on the injury severity of car passengers in truck-car head on crashes.
A-1 Injury severity characteristics.
A-2 Crash characteristics (Crash level).
A-3 Crash characteristics (Vehicle level).
A-4 Truck driver characteristics.
A-5 Car driver characteristics.
A-6 Truck characteristics.
A-7 Car characteristics.
A-8 Truck occupant characteristics.
A-9 Car occupant characteristics.
A-10 Carrier characteristics.


LIST OF FIGURES

Figure

5-1 A cross-tabulation of cumulative injury severity and highest injury severity
5-2 Distribution of the cumulative injury severities by highest injury severity
5-3 Cross tabulations of cumulative injury cost against the highest injury severity
5-4 Cross tabulation of average injury severity against number of occupants by HIC value
5-5 Distribution of injury severity levels by occupant type


Abstract of Dissertation Presented to the Graduate School of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

AN ANALYSIS OF INJURY SEVERITIES OF LARGE-TRUCK CRASHES

By

Xiaoyu Zhu

May 2011

Chair: Sivaramakrishnan Srinivasan
Major: Civil Engineering

Traffic crashes have become one of the largest public health problems in the world and will remain one of the most pressing transportation issues in the future. The importance of trucking to freight logistics and, consequently, its impact on the economic well-being of a nation is well acknowledged. There is a need to study crashes in order to improve the safety of the transportation system, educate drivers, enhance carrier operations, and reduce incident costs. Data from the Large Truck Crash Causation Study (LTCCS) are used in the empirical analysis.

The goal of developing econometric models of injury severity in large-truck crashes is accomplished in a three-step procedure. The first step examines the relationship between injury severity and a large number of interdependent explanatory factors using a crash-level sample. Injury severity is modeled using both police-reported and researcher-determined scales. The results indicate strong impacts of several crash-, truck-, and car-level variables on the severity of the crashes.

The second step proceeds to an occupant-level study because the highest injury severity cannot fully represent the severity of the whole crash. A methodology is developed to incorporate the effects of common unobserved factors (error correlations) affecting the injury severity of all persons involved in the same vehicle and crash. Both the intra-crash and intra-vehicle correlations are confirmed to be important in the second step.

A more advanced and flexible model structure is explored in the last step. This approach is attractive because it recognizes the ordered nature of the choice alternatives and, at the same time, is not constrained by the "proportional odds" or "parallel line" restrictions of the ordered probit. The results indicate that variables that are not significant in the ordered-probit model may nonetheless affect injury severity. The significant driver-behavior variables also differ across roles (truck occupant, car driver, and car passenger).

In summary, advanced and flexible methodologies for occupant-level injury-severity analysis are developed and compared in this dissertation. The results and implications are useful from the standpoints of travelers, transportation engineers, and policy makers.


CHAPTER 1 INTRODUCTION

This chapter motivates the need to study large-truck crashes and outlines the

objectives of the dissertation. The organization of the rest of this document is also

presented.

1.1 Background: Traffic Safety and Large Trucks

According to the World Health Organization, more than a million people are killed on the world's roads each year (Leonard, 2004). In the United States, 26,689 occupants (drivers and passengers) died and an additional 2,120,000 were injured in the 5,811,000 crashes in 2008, according to the National Highway Traffic Safety Administration (NHTSA, 2009). These numbers highlight that traffic safety is one of the critical public-health and transportation problems in the world.

Among all motor-vehicle traffic crashes, the focus of this study is on crashes involving large trucks. A large truck is defined as a commercial vehicle weighing more than 10,000 lbs. The importance of trucking to freight logistics and, consequently, its impact on the economic well-being of a nation is well acknowledged in the literature. Specifically, based on the 2007 Commodity Flow Survey results, among all the modes, trucks moved 74.3% of all freight by value, 67.2% by weight, and 40% by ton-miles (USDOT/BTS, 2004). These large volumes of truck traffic, the unique operating characteristics of the trucks and drivers, and the design and weight of trucks have resulted in large numbers of crashes, injuries, and fatalities.

In 2005, over 5,000 people died and an additional 114,000 were injured in the 442,000 large-truck crashes in the United States. Approximately 12% of all traffic fatalities involved a large-truck crash (NHTSA, 2006). In 2007, 413,000 large trucks were involved in traffic crashes resulting in 4,808 fatalities, which were 12% of all traffic fatalities (NHTSA, 2008). Large trucks account for approximately 4% of all vehicles but about 8% of vehicles in fatal crashes. Seventy-five percent of the fatalities that resulted from crashes involving large trucks were occupants of other vehicles. In addition to these cross-sectional statistics, the time-series trends reported by Lyman and Braver (2003) are also illuminating. Based on aggregate data from 1975 to 1999, these authors find that the involvement of large trucks in fatal crashes per truck vehicle-mile traveled has decreased. However, with a corresponding increase in the volume of truck travel, the involvement per unit population has not seen the same declining trend. Thus, there is continued public concern about large-truck crashes.

Of all the injury and fatal crashes in 2008, 566,554 (34%) were single-vehicle crashes and 1,097,463 (66%) were multi-vehicle crashes. This indicates that at least two-thirds of the injury and fatal crashes involved more than one person (even single-vehicle crashes can involve more than one person). Among all crashes with at least one injury, there are, on average, 1.29 persons injured or killed per crash. These statistics indicate that a large number of crashes involve more than one person, in many cases from multiple vehicles. Despite these results, the vast majority of the literature on crash severity has focused only on the highest level of severity rather than on the injuries sustained by the different persons involved in the crash. Arguably, one of the major reasons for this state-of-practice approach is that the highest severity of the crash is more reliably recorded than the severities sustained by individual persons (Chang and Mannering, 1999).
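
The single-vehicle and multi-vehicle shares quoted for 2008 follow directly from the two crash counts. The short Python check below uses only the figures given in the text; the variable names are illustrative and not part of the original study.

```python
# Injury and fatal crashes in 2008, as quoted in the text.
single_vehicle = 566_554
multi_vehicle = 1_097_463

total = single_vehicle + multi_vehicle
single_share = single_vehicle / total
multi_share = multi_vehicle / total

print(f"total injury and fatal crashes: {total:,}")
print(f"single-vehicle share: {single_share:.0%}")
print(f"multi-vehicle share:  {multi_share:.0%}")
```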


The above statistics clearly underscore the need for studying large-truck crashes

towards improving the safety of the transportation system. The results from such studies

will be valuable in transportation policy, improvement of carrier operation, and incident-

cost reduction. The broad goal of this dissertation is to contribute towards that end.

Specifically, data from a recent, nationally-representative sample of large-truck crashes

will be analyzed to determine the factors affecting the injury severity of these crashes.

1.2 Objectives of the Research

The objective of this study is to develop econometric models of injury-severity in

large-truck crashes. The models developed will facilitate the evaluation of a variety of

countermeasures from the standpoints of transportation control, roadway design, traffic

operations, and carrier management aimed at improving safety. Despite the importance

of truck-safety, research on understanding the relative magnitudes of the influences of

the various factors affecting the injury-severity of such crashes is limited. The models

estimated in this study will include a comprehensive set of explanatory factors including

the characteristics of the crash, vehicle, truck-carrier, and the occupants.

This dissertation will also contribute methodologically to the literature on injury-severity modeling. Two important enhancements will be incorporated. First, the use of advanced, flexible structures such as the Ordered Generalized Extreme Value (OGEV) model will be explored to replace the simpler and more restrictive ordered-probit models conventionally used in the injury-severity literature. Second, the effects of common unobserved factors (error correlations) affecting the injury severity of all persons involved in the same crash will be incorporated in the models developed in this dissertation. Most research to date either ignores this effect (even though many crashes involve more than one person) or focuses on the injury to one particular person (such as the car driver) involved in the crash. The contributions of this dissertation are discussed further in Chapter 2.
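
To make the two ideas above concrete, the following Python sketch shows a textbook ordered-probit probability calculation and the correlation induced by a shared crash-level error term. It is only an illustration under stated assumptions: the function names, the thresholds, the index value, and the variance of the shared term are all hypothetical, and the dissertation's own specifications are the ones given in the later chapters.

```python
import math


def normal_cdf(z):
    """Standard normal CDF, written with math.erf to avoid extra dependencies."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def ordered_probit_probs(index, thresholds):
    """Return P(y = k) for each severity level k.

    The latent propensity is y* = index + eps with eps ~ N(0, 1), and
    level k is observed when y* falls between consecutive thresholds.
    """
    cuts = [-math.inf] + list(thresholds) + [math.inf]
    return [
        normal_cdf(cuts[k + 1] - index) - normal_cdf(cuts[k] - index)
        for k in range(len(cuts) - 1)
    ]


def intra_crash_correlation(sigma_u):
    """Correlation of latent propensities for two persons in one crash.

    Assumes y* = index + u + eps, where u ~ N(0, sigma_u**2) is shared by
    everyone in the crash and eps ~ N(0, 1) is person-specific.
    """
    return sigma_u ** 2 / (sigma_u ** 2 + 1.0)


# Hypothetical values chosen only for illustration.
probs = ordered_probit_probs(index=0.5, thresholds=[0.0, 1.0, 2.0])
print([round(p, 3) for p in probs])
print(round(intra_crash_correlation(sigma_u=1.0), 3))
```

In this sketch a single index shifts all severity probabilities at once, which is the restriction that more flexible structures aim to relax, and a larger `sigma_u` means that persons in the same crash share more of their unobserved injury propensity.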

1.3 Organization of the Document

A brief synthesis of the relevant literature is presented in Chapter 2. A detailed description of the data and the sample-formation procedure is given in Chapter 3. Chapter 4 presents the results of the crash-level ordered-probit models for injury severity. The empirical results capturing the effects of several explanatory factors, including driver, vehicle, crash, environment, and carrier characteristics, are discussed. These models serve as the basis for the more advanced specifications that follow. Chapter 5 discusses how the highest level of injury sustained may not be a comprehensive descriptor of the overall severity of the crash and presents a methodology to simultaneously model the severity sustained by all persons involved in the crash. Chapter 6 continues the analysis of the injury severity of each occupant with a panel heteroskedastic Ordered Generalized Extreme Value (OGEV) model, which relaxes the constraints of the ordered-probit model. Chapter 7 summarizes the key contributions of this dissertation and discusses directions for future research.

CHAPTER 2 LITERATURE REVIEW

This chapter presents a synthesis of literature relevant to the dissertation's

objective of modeling injury-severity in large-truck crashes. The rest of this chapter is

organized as follows. A review of past research on large-truck crashes is presented in

Section 2.1. A summary of studies on modeling injury-severity in automobile crashes, in

general, is presented in Section 2.2. Significant emphasis is placed on the modeling

methods employed and the key empirical results. Section 2.3 positions the dissertation

in the context of past research by identifying the gaps in knowledge and the

contributions of this study.

2.1 Research on Large-Truck Crashes

A brief synthesis of literature on large-truck (gross vehicle weight rating greater

than 10,000 pounds) crashes, with particular focus on the analysis of injury-severity of

such crashes is presented in this section of the chapter.

Work undertaken by Khattak and colleagues (Duncan et al., 1998; Khattak et al., 2002) and by Chang and Mannering (1999) is most directly related to our efforts.

Duncan et al. (1998) examined injury severity in rear-end collisions between heavy trucks and passenger cars. The focus was on modeling the injury to the passenger-car occupants, as they are almost always likely to sustain more severe injuries than truck drivers in crashes with large/heavy trucks. Ordered-probit models were developed using the Highway Safety Information System (HSIS) data from North Carolina for the years 1993-1995. The results indicate that higher speeds (and speed differentials), darkness, and grade increase the severity of the injury. Females and drunk drivers were estimated to sustain more severe injuries than males and non-drunk drivers, respectively. Snowy/icy road conditions and traffic congestion were found to decrease injury severity relative to dry and free-flow traffic conditions, respectively. Finally, the car being struck in the rear was found to lead to more severe injuries than the truck being struck in the rear.

Khattak et al. (2002) used the HSIS data from North Carolina for the years 1996-1998 to examine the injury severity of single large-truck crashes. In particular, the intent was to examine the differences between rollover and non-rollover crashes. Using ordered-probit models, the authors found that rollover leads to more severe injuries in single-truck crashes. Further, dangerous driving behaviors such as drug/alcohol use, speeding, and not wearing seat belts increase the injury severity. Crashes that result in fire are also estimated to have a greater injury severity. In the same study, the authors went on to examine the factors affecting the rollover of trucks in single-truck crashes. They found that rollovers are more likely to happen at a right, left, or U-turn and on a curved road. Trucks with longer trailers are more likely to roll over. Reckless driving has the largest influence on increasing rollover propensity. These factors may also be construed as affecting injury severity, as rollover crashes were established to be more severe than non-rollover crashes.

Chang and Mannering (1999) modeled vehicle occupancy and the most severe injury sustained by an occupant of the vehicle using data from the state of Washington. The need to model vehicle occupancy simultaneously with injury severity was motivated by the observation that the possibility of a severe injury increases with the number of persons in the vehicle. Nested-logit models were developed with occupancy as the upper-level nest and injury severity as the lower-level nest. Unlike the previous efforts discussed, Chang and Mannering adopted an unordered discrete-choice structure to model injury severity. The authors segmented the data into truck-involved and non-truck-involved crashes and demonstrated the statistical and empirical validity of such segmentation. For example, the results indicate that higher speeds are strongly associated with more severe crashes when trucks are involved (the effect was insignificant in the case of non-truck crashes). Similarly, the effects of turning movements (right turn and left turn) of the vehicles on crash severity were also found to be different. Consistent with expectations, the results also indicated that multi-occupant vehicles in truck-involved crashes result in significantly more severe injuries. Overall, these authors argue that countermeasures aimed at reducing the severity of truck-involved crashes could be different from those aimed at reducing the severity of non-truck crashes.
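
The nested structure described for Chang and Mannering (1999), with occupancy in the upper level and injury severity in the lower level, can be illustrated with the generic two-level nested-logit formulas. The Python sketch below is not their estimated model: the function name, the utilities, and the log-sum (scale) coefficient are hypothetical values chosen only to show how the joint probabilities are assembled.

```python
import math


def nested_logit_joint_probs(upper_utils, lower_utils, scale):
    """Joint probabilities P(nest m, alternative j) in a two-level nested logit.

    upper_utils[m]    : utility attached to upper-level nest m (e.g. occupancy)
    lower_utils[m][j] : utility of lower-level alternative j within nest m
                        (e.g. injury severity given occupancy)
    scale             : log-sum coefficient, assumed to lie in (0, 1]
    """
    # Inclusive value (log-sum) of each nest.
    inclusive = [
        math.log(sum(math.exp(v / scale) for v in nest)) for nest in lower_utils
    ]
    # Unnormalized upper-level terms exp(W_m + scale * IV_m).
    upper_terms = [
        math.exp(w + scale * iv) for w, iv in zip(upper_utils, inclusive)
    ]
    upper_total = sum(upper_terms)

    joint = []
    for term, iv, nest in zip(upper_terms, inclusive, lower_utils):
        p_upper = term / upper_total
        # Conditional probability exp(V_j / scale) / exp(IV_m), times P(m).
        joint.append([p_upper * math.exp(v / scale - iv) for v in nest])
    return joint


# Hypothetical utilities: two occupancy levels, three severity levels each.
joint = nested_logit_joint_probs(
    upper_utils=[0.0, -0.5],
    lower_utils=[[0.0, -1.0, -2.0], [0.0, -0.6, -1.5]],
    scale=0.7,
)
for row in joint:
    print([round(p, 3) for p in row])
```

When `scale` equals 1 the expression collapses to an ordinary multinomial logit over all occupancy-severity pairs, which is a convenient sanity check on the implementation.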

In contrast to the previous three studies which have examined the level of injury

severity, other researchers have focused on fatal crashes involving large-trucks.

Braver et al. (1996) examined the effect of roadway geometry, weather, and other factors on the incidence of fatal large truck-car crashes. Defiance of traffic control devices, curves, and slippery roadway conditions were some of the conditions found to be associated with fatal crashes.

Campbell (1991) examined the impact of driver age on the involvement in fatal

crashes. Based on nationally-representative data for the years 1980-1984, the author

developed estimates for the risk of involvement in fatal crashes as the number of fatal

crashes per hundred million vehicle miles. The analysis indicates that younger drivers

(age < 27 years) are over-involved in fatal crashes. Further, the relative risk of very


young drivers (less than 21 years of age) was found to be about six times the overall

risk for all drivers.
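
The exposure-based risk measure described above, fatal crashes per hundred million vehicle miles, and the relative risk derived from it can be sketched as follows. The counts and mileages in this Python example are hypothetical and are not data from Campbell (1991); they are chosen only so that the arithmetic is easy to follow.

```python
def fatal_involvement_rate(fatal_crashes, vehicle_miles):
    """Fatal crash involvements per hundred million vehicle miles traveled."""
    return fatal_crashes / vehicle_miles * 1e8


# Hypothetical inputs for illustration only.
young_rate = fatal_involvement_rate(fatal_crashes=30, vehicle_miles=5.0e8)
overall_rate = fatal_involvement_rate(fatal_crashes=400, vehicle_miles=4.0e10)

# Relative risk of the young-driver group compared with all drivers.
relative_risk = young_rate / overall_rate

print(f"young-driver rate: {young_rate:.1f} per 100 million vehicle miles")
print(f"overall rate:      {overall_rate:.1f} per 100 million vehicle miles")
print(f"relative risk:     {relative_risk:.1f}")
```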

Golob et al. (1987) examined the severity (both injury severity and incident duration) of truck-involved freeway accidents. About 9,000 crashes from the years 1983-1984 were obtained from the TASAS (Traffic Accident Surveillance and Analysis System) database maintained by the California Department of Transportation. All data were from the Los Angeles area. Based on the number of fatalities per accident, the "hit-object" type crashes were found to be most dangerous (0.025 fatalities per accident). "Rear end" and "other type" (other than hit-object, side-swipe, broad-side, and overturn) crashes were also very dangerous (0.021 fatalities per accident).

It is useful to mention here that past studies have also examined other aspects of

large-truck safety (other than injury severity). For example, research undertaken by

Blower et al. (1993) examined the factors affecting the crash propensities (or the risk of

being involved in a crash) and showed that the truck crash rate is significantly affected by

truck configuration, location (rural or urban), traffic density, and time of day. Hallmark

(2009) focused on the incidence of a specific type of crash, the lane-departure

crash. Using logistic regression, the study found that such crashes were more

likely to happen when the driver is fatigued, upset, distracted, or unfamiliar with the

roadway. More generally, driver fatigue has been recognized as an important factor

affecting truck crashes. Based on a survey conducted in New Zealand, Gander et al.

(2006) identified 7.6% of crashes as fatigue-related. The duration of the

most recent sleep period was considered as a measurement of fatigue. In consideration

of the effect of driver fatigue, the hours of service (HOS) of commercial drivers are

regulated by the Federal Motor Carrier Safety Administration (FMCSA) in the United

States. Commercial motor vehicle (CMV) drivers are limited to 11 cumulative hours

driving in a 14-hour period, which must then be followed by a rest period of no less than

10 consecutive hours. Drivers employed by carriers in "daily operation" may not drive

more than 70 hours within any period of 8 consecutive days (NHTSA, 2008). Although

the primary focus of this dissertation research is on injury severity (conditional on a

crash) and not on the risk of a crash happening, insights from the studies discussed above

are useful, and appropriate explanatory variables (such as fatigue) will be included in our

models.

Overall, the literature on the modeling of injury-severity of large-truck crashes

appears to be limited. Past studies have focused on specific types of crashes (such as

rollover or rear-end) or on specific injury-severity levels (such as fatal crashes). Also,

the methods employed are rather simplistic. In this context, the intent of this dissertation

is to present a comprehensive analysis of the injury severity of all types of crashes

involving large-trucks. Flexible econometric structures and a comprehensive empirical

specification will be developed.

2.2 Research on Modeling Injury-Severity of Automobile Crashes

Although few studies have analyzed injury-severity in large-truck crashes, the

body of literature on modeling injury severity, in general, is extensive. A substantial

fraction of these are focused on automobile crashes and these are discussed in the rest

of this section. It is envisioned that methodological and empirical insights from these

past studies will inform our research on large-truck crashes. It is also useful to

acknowledge that the injury severity of motorcycle, pedestrian, and bicycle crashes has

also been studied in the past. To limit the scope of our literature review, we do not

present an extensive discussion of these studies. However, advanced methods used to

model such crashes are discussed wherever appropriate.

Table 2-1 summarizes the key features of several studies in the literature on

modeling injury-severity of automobile crashes. Five important features are discussed in

separate sub-sections: (1) The Injury of Interest, (2) Levels of Injury Severity, (3) Data

Source, (4) Modeling Method, and (5) Explanatory Factors. The first three studies listed

in the table are the ones that explicitly focus on large-truck crashes (also discussed in

Section 2.1).

2.2.1 Injury of Interest

Automobile crashes could potentially involve one or more vehicles, and each

vehicle could have one or more occupants (including the driver). All

these occupants could have different levels of injury severity. Correspondingly, there

are differences in the injury-severity of interest across the studies presented in Table 2-

1.

In the simplest case, some studies have defined the overall severity of the crash

as the most-severe injury sustained by any person involved in the crash. Alternatively,

others define the injury-severity of each vehicle as the most severe injury sustained by

any person in that vehicle (Chang and Mannering, 1999; Chang and Wang, 2006; and

Milton et al., 2008). Some studies have focused specifically on the injury sustained by

the driver of the vehicles (Kockelman and Kweon, 2002, Eluru and Bhat, 2007,

Yamamoto et al., 2008, Wang and Abdel-Aty, 2008, Delen et al., 2006 and Xie et al.,

2009). Others have focused on specific occupants such as driver and front seat

occupant (Newgard, 2008), and front seat and rear seat passengers (Shimamura et al.,

2005). At the other end of the spectrum are studies that have examined the injury

severity of each of the occupants involved in the crash (O'Donnell and Connor, 1996;

Kuhnert et al., 2000; Khattak et al., 2003; Chang and Wang, 2006; and Eluru et al.,

2009).

The focus on the most-severe injury is appropriate from the standpoint of data, as

the injury sustained by every person involved in the crash is often not accurately

recorded (the most severe injury is generally well recorded; Chang and Mannering,

1999). At the same time, models of the injury sustained by every occupant involved in

the crash (subject to data availability) present a comprehensive description of the

overall severity of crashes. In light of the above discussions, this dissertation will

develop models at both the crash-level (most severe injury) and the occupant level. The

data available support such an effort and are described in detail in the next chapter.

2.2.2 Levels of Injury Severity

Injury severity is recorded on an ordinal scale. The number of categories used in

modeling ranges from two (high or low; Ouyang et al., 2002) to seven (no injury, minor,

moderate, serious, severe, critical, and non-survivable; Newgard, 2008). Four and five

categories are more common. The "KABCO" scale is the most commonly used (for

example, Duncan et al., 1998), with "K" being the most severe category representing a

fatal crash, "A" representing incapacitating injury, "B" representing non-incapacitating

injury, "C" being minor injury, and "O" representing the least severe category (no

injury/property damage only). Most state and national crash databases use this scale.

Consistently, four- and five-level ordinal scales are most commonly used in the models

for injury severity.

The dataset used in this research provides two measures of injury severity. The

first measure is based purely on police records and the second derived from additional

hospital data and interviews (further details provided in the next chapter). The first

measure uses a four-level scale whereas the second uses a three-level scale. Models

will be developed using each of these measures.

2.2.3 Data Sources

Most research in injury-severity modeling has used data from national- and state-

level sources. For instance, the Crashworthiness Data System (CDS) from the National

Automotive Sampling System (NASS) was used by Wang and Kockelman (2005) and

Newgard (2008). The General Estimates System (GES) from the National Automotive

Sampling System (NASS) was used by Delen et al. (2006) and Eluru et al. (2009).

Chang and Mannering (1999) and Yamamoto et al. (2008) used the state-level

Washington State Highway Accident Records Database. Non-US data such as the

Linked Accident Database from Japan (Shimamura et al., 2005) and a French accident

database (Lapparent, 2008) have also been used.

The data to be used in this study come from the Large Truck Crash Causation

Study (LTCCS – discussed in more detail in Chapter 3). The database assembled by

this study augments the conventional crash-data obtained from police reports in several

ways. For instance, additional data related to "human factors" such as the fatigue,

illness, and distraction of the drivers were collected. Historical records on the safety of

the drivers, vehicles, and carriers (past violations and citations) involved in the crashes

were also obtained and added to the crash data obtained from the police accident

reports. Thus, the database available for this study would enable the development of a

richer empirical specification.

2.2.4 Modeling Method

Almost all past research on injury-severity modeling has employed

statistical/econometric methods. Exceptions include the Classification and Regression

Tree (CART) method used by Kuhnert et al. (2000) and Chang and Wang (2006) and

Artificial Neural Networks used by Delen et al. (2006). While the CART and Neural

Network methods can help establish very flexible and non-linear relationships, their

value as descriptive models is limited because it can be extremely difficult to interpret

the marginal effects of various factors as implied by the estimated relationships. This

ability to obtain physical meanings of the parameters is important from the standpoint of

identifying appropriate countermeasures to reduce crash severity. Further, such

methods, unlike the statistical approaches, often do not provide estimates of the strength

of correlations (i.e., the p-values). Finally, methods like CART can be difficult to apply

with an increasing number of explanatory variables. This dissertation focuses on the use of

econometric methods. In the rest of this section, such methods used in past research

for injury-severity modeling are discussed in detail.

There are two fundamental issues in the modeling of injury-severity of crashes: (1)

the treatment of the ordinal, injury-severity variable in the modeling process and (2) the

incorporation of interdependencies (error correlations) among the injury-severity of the

different persons involved in the same crash.

2.2.4.1 Treatment of ordinal

As discussed in Section 2.2.2, injury severity is generally recorded on an ordinal

scale. Most commonly, the following five-level scale in the increasing order of injury

severity is used: Property-damage only, Possible Injury, Non Incapacitating Injury,

Incapacitating Injury, and Killed (Fatal). Thus, ordered-response discrete-choice models

(ordered probit or ordered logit) are appropriate for the analysis of such data. In fact,

this has been a popular approach for modeling injury severity in general (for example,

Kockelman and Kweon, 2001; O'Donnell and Connor, 1996).

In this approach, the observed, ordinal, injury-severity level is related to an

unobserved (latent), continuous injury propensity, which is then related to a vector of

explanatory variables corresponding to the crash via a linear-in-parameters

specification.
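
One common way to write this structure is sketched below. The notation is an assumption introduced here for illustration and is not necessarily the notation adopted later in this dissertation: q indexes the observation (crash or person), x_q is the vector of explanatory variables, K is the number of severity levels, the thresholds satisfy \psi_0 = -\infty and \psi_K = +\infty, and F is the standard normal CDF (ordered probit) or the logistic CDF (ordered logit).

```latex
\begin{align*}
y_q^{*} &= \beta' x_q + \varepsilon_q, \\
y_q &= k \quad \text{if } \psi_{k-1} < y_q^{*} \le \psi_k, \qquad k = 1, \dots, K, \\
P(y_q = k) &= F(\psi_k - \beta' x_q) - F(\psi_{k-1} - \beta' x_q).
\end{align*}
```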

Wang and Kockelman (2005) and O'Donnell and Connor (1996) employed the

heteroscedastic variant of the ordered-response model (the conventional ordered-

response model assumes homoscedasticity, or equal variances). Thus, their

specifications allowed the variance in the error term to vary systematically as a function

of certain exogenous factors such as speed, vehicle type, vehicle weight, time of day

and occupant characteristics.
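
As a rough illustration, one frequently used way to let the error variance depend on exogenous factors is an exponential scale function. This parameterization is an assumption made here for exposition; the cited studies may use a different form. Here x_q are the explanatory variables with coefficients \beta, \psi_k are the thresholds between adjacent severity levels, F is the CDF of the standardized error term, and z_q (hypothetical notation) collects the factors that affect the variance, with coefficients \gamma.

```latex
\begin{align*}
\sigma_q &= \exp(\gamma' z_q), \\
P(y_q = k) &= F\!\left(\frac{\psi_k - \beta' x_q}{\sigma_q}\right)
            - F\!\left(\frac{\psi_{k-1} - \beta' x_q}{\sigma_q}\right).
\end{align*}
```

Setting \gamma = 0 gives \sigma_q = 1 for every observation, which recovers the conventional homoscedastic model.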

In general, the maximum-likelihood procedure is used for model estimation.

However, Xie et al. (2009) adopted the Bayesian inference procedure. The authors

compared the Bayesian and the non-Bayesian methods and concluded that the results

are similar with large samples.

There are several advantages to using the ordered-response models. Such

models explicitly recognize the ordering in the levels of injury severity. The specification

is parsimonious (fewer parameters as there is only one propensity function to estimate).

Finally, the interpretations are straightforward. Generally, a positive coefficient on a

variable implies that the corresponding explanatory factor is associated with more

severe crashes and a negative coefficient implies that the corresponding variable is

associated with less severe crashes.

A key shortcoming of the ordered-response models is that they are restrictive in

capturing the effects of explanatory variables on the different levels of injury severity.

Specifically, if a factor is estimated to increase the probability of the most-severe injury

(i.e., fatal injury), then the specification implies that the same factor necessarily

decreases the probability of the least-severe injury (often this is the "no-injury"

category). However, this may not always be true. For instance, it has been shown that

airbags decrease the likelihood of fatal injuries in the event of a crash. At the same time,

the deployment of airbags can also cause minor injuries and hence decrease the

likelihood of the least-severe (no-injury) outcome. Simple ordered-response models cannot capture this

effect. Researchers have attempted alternate approaches to address this issue and

these are discussed in the rest of this section.
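
The restriction can be seen numerically in a minimal sketch of an ordered probit with fixed thresholds. The threshold values, the four-level scale, and the size of the propensity shift are hypothetical numbers chosen only for illustration; they are not estimates from any study cited here.

```python
from math import erf, sqrt


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def ordered_probit_probs(propensity, thresholds):
    """Probability of each ordered level for a given systematic propensity."""
    cuts = [float("-inf")] + list(thresholds) + [float("inf")]
    return [
        normal_cdf(hi - propensity) - normal_cdf(lo - propensity)
        for lo, hi in zip(cuts[:-1], cuts[1:])
    ]


# Hypothetical fixed thresholds separating four levels: none, minor, severe, fatal.
THRESHOLDS = [-0.5, 0.5, 1.5]

base = ordered_probit_probs(0.0, THRESHOLDS)
# A factor with a positive coefficient raises the propensity (here by 0.4).
shifted = ordered_probit_probs(0.4, THRESHOLDS)

for level, p0, p1 in zip(["none", "minor", "severe", "fatal"], base, shifted):
    print(f"{level:>6}: {p0:.3f} -> {p1:.3f}")
```

Whatever values are chosen, a higher propensity raises the probability of the highest level and lowers the probability of the lowest level; the two extremes can never move in the same direction.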

The ordered-response models can be directly extended to address this issue by

allowing for variable threshold parameters. Eluru et al. (2008) have applied such a

"mixed generalized ordered logit model" to study injury severity of pedestrians and

bicyclists. Another extension of the conventional ordered-response structure is the

Partial Proportional Odds Model (Wang and Abdel-Aty, 2008). This approach also

allows for the coefficients on the explanatory variables to be different across the injury-

levels. However, relative to the simpler ordered-response models, the interpretations of

the parameters from these advanced models are not straightforward.
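
A minimal sketch of the variable-threshold idea follows. It assumes a logistic error term and simple additive shifts in the lowest and highest thresholds when an airbag deploys; the function names, the shift values, and the additive form are all hypothetical and are not the specification of Eluru et al. (2008). Any real specification must also keep the thresholds in increasing order.

```python
from math import exp


def logistic_cdf(z):
    """Standard logistic cumulative distribution function."""
    return 1.0 / (1.0 + exp(-z))


def generalized_ordered_logit_probs(propensity, thresholds):
    """Level probabilities when the thresholds may differ across observations."""
    cdf = [0.0] + [logistic_cdf(t - propensity) for t in thresholds] + [1.0]
    return [hi - lo for lo, hi in zip(cdf[:-1], cdf[1:])]


def thresholds_for(airbag_deployed):
    """Hypothetical thresholds that depend on airbag deployment."""
    # Deployment lowers the first threshold (some injury becomes more likely)
    # and raises the last threshold (a fatal injury becomes less likely).
    shift_low, shift_high = (-0.3, 0.4) if airbag_deployed else (0.0, 0.0)
    return [-0.5 + shift_low, 0.5, 1.5 + shift_high]


without_airbag = generalized_ordered_logit_probs(0.0, thresholds_for(False))
with_airbag = generalized_ordered_logit_probs(0.0, thresholds_for(True))
```

With the propensity held fixed, the probabilities of both the no-injury and the fatal categories fall when the airbag deploys, which a fixed-threshold model cannot reproduce.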

Another approach to address the above issue, while still having easily

interpretable structures, is to use unordered specifications such as the nested-logit

model (for instance, Savolainen and Mannering, 2007 and Abdel-Aty et al., 2003).

Such models require the specification of a utility function corresponding to each

alternative (unlike the ordered-probit models, which use a single propensity function and

fixed thresholds) and hence overcome the restrictive empirical specification.

However, the use of the nested-logit structure implies that the ordering of the injury-severity

levels is ignored. The nested-logit models require that each alternative belong to only

one nest, and hence, the error correlations between adjacent levels of injury severity

cannot all be effectively captured. A particular extension of the nested-logit model that is

appropriate in the context of ordered alternatives is the Ordered Generalized Extreme

Value (OGEV) model but this structure has not been applied to injury-severity modeling.

The most flexible error-correlations across the choice alternatives can be captured by

the use of mixed-logit models (Train, 2009). With the exception of Milton et al. (2008),

these methods have not been applied to injury-severity analysis. It also appears that the

primary motivation for the above researchers to use the mixed-logit model is to capture

heterogeneous impacts of explanatory variables on the injury severity by using random

coefficients rather than to capture flexible error correlations.
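
To illustrate how alternative-specific utility functions relax the restriction of the ordered models, the sketch below uses a plain multinomial logit, which is the simplest unordered structure and not the nested or mixed logit used in the studies cited above. The constants and airbag coefficients are hypothetical values chosen for illustration.

```python
from math import exp

LEVELS = ["none", "minor", "severe", "fatal"]
# Hypothetical alternative-specific constants and airbag coefficients.
CONSTANTS = [1.0, 0.5, 0.0, -1.0]
AIRBAG_COEFS = [-0.3, 0.4, 0.0, -0.6]


def multinomial_logit_probs(airbag_deployed):
    """Choice probabilities with one utility function per severity level."""
    utilities = [c + b * airbag_deployed for c, b in zip(CONSTANTS, AIRBAG_COEFS)]
    exp_utilities = [exp(v) for v in utilities]
    total = sum(exp_utilities)
    return [e / total for e in exp_utilities]


p_without = multinomial_logit_probs(0)
p_with = multinomial_logit_probs(1)
```

Because each level has its own coefficient, the same variable can lower the probabilities of both the no-injury and the fatal outcomes while raising the probability of a minor injury. The cost, as noted above, is that the ordering of the levels plays no role in the specification.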

A third approach to capturing the ordinality among the alternatives is to model the

ordered choice as a sequence of binary choices where each binary choice involves

choosing a specific level relative to higher (or lower) levels. Dissanayake and Lu (2002)

used such a sequential binary approach. Two sequential model structures were

estimated: one in which the injury severity varied from the lowest level to the highest and

the second in which the severity varied from the highest to the lowest. However,

these researchers assumed that the binary choices were independent. Yamamoto et al.

(2008) pointed out that the error terms are correlated, especially between

successive levels, and developed an improved model that (partially)

accommodates correlations among the successive levels.
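
The sequential binary idea with independent binary choices can be sketched as follows. The sketch assumes a lowest-to-highest sequence of binary logits, in which each stage decides whether the injury stops at the current level or moves to a higher one; the function names and utility values are hypothetical.

```python
from math import exp


def logistic(v):
    """Binary logit probability for a systematic utility v."""
    return 1.0 / (1.0 + exp(-v))


def sequential_binary_probs(stop_utilities):
    """Level probabilities from independent binary 'stop here' choices.

    stop_utilities[k] is the utility of stopping at level k given that the
    injury is at least level k; the last level absorbs what remains.
    """
    probs = []
    remaining = 1.0  # probability of having reached the current stage
    for v in stop_utilities:
        p_stop = logistic(v)
        probs.append(remaining * p_stop)
        remaining *= 1.0 - p_stop
    probs.append(remaining)
    return probs


# Three binary stages imply four severity levels.
example = sequential_binary_probs([0.0, 0.0, 0.0])
```

Multiplying the stage probabilities in this way is valid only under the independence assumption discussed above; accommodating correlation between successive stages requires a joint error structure.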

2.2.4.2 Incorporating interdependencies among the injuries of all persons involved in the crash

Most of the injury-severity models can be classified as "single-equation" models,

and these do not consider the correlations among the injuries sustained by all persons

in the same crash or in the same vehicle. A few studies have used bivariate ordered-

response structures to account for correlations between two persons involved in the same

crash. For example, Hutchinson (1986) analyzed the severity of injuries sustained by

the driver and front-seat passenger simultaneously. Yamamoto and Shankar (2004)

applied such a model to the driver and the most severely injured person in the vehicle. Ouyang

et al. (2002) studied rear-end crashes involving trucks and cars. The injury levels (on a

binary scale) associated with both vehicles were estimated simultaneously.

Most recently, Eluru et al. (2009) used a copula-based approach to accommodate

the dependence in injury-severity propensities among the multiple occupants of the

same vehicle (the dependencies among the different vehicles in the same crash, were,

however not considered). Copulas are functions that generate stochastic-dependence

relationships (i.e., a multivariate distribution) among random variables with given

marginal distributions.
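
As a small illustration of how a copula turns given marginal distributions into a joint distribution, the sketch below uses the Clayton copula to compute joint probabilities of the ordinal injury levels of two occupants. The choice of the Clayton family, the dependence parameter, and the cumulative marginal probabilities are assumptions made for illustration and are not taken from Eluru et al. (2009).

```python
def clayton_copula(u, v, theta):
    """Clayton copula C(u, v) for theta > 0 (positive dependence)."""
    if u <= 0.0 or v <= 0.0:
        return 0.0
    return (u ** -theta + v ** -theta - 1.0) ** (-1.0 / theta)


def joint_prob(j, k, cum1, cum2, theta):
    """P(occupant 1 at level j, occupant 2 at level k) by rectangle differencing."""
    c = lambda a, b: clayton_copula(a, b, theta)
    return (
        c(cum1[j + 1], cum2[k + 1])
        - c(cum1[j], cum2[k + 1])
        - c(cum1[j + 1], cum2[k])
        + c(cum1[j], cum2[k])
    )


# Hypothetical cumulative marginal probabilities for three ordered levels,
# padded with 0 and 1 at the ends.
DRIVER_CUM = [0.0, 0.6, 0.9, 1.0]
PASSENGER_CUM = [0.0, 0.5, 0.85, 1.0]
THETA = 2.0  # hypothetical dependence parameter

joint = [
    [joint_prob(j, k, DRIVER_CUM, PASSENGER_CUM, THETA) for k in range(3)]
    for j in range(3)
]
```

The row and column sums of the resulting table reproduce the marginal probabilities of each occupant, while the cells reflect the dependence: with positive dependence, both occupants being at the lowest level is more likely than the product of the two marginal probabilities.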

While mixed-models and error-component structures have been routinely used in

other fields (such as economics and travel-demand modeling) to estimate correlated

models, it appears that such methods have had limited applications in the context of

injury-severity analysis. This dissertation will contribute to the literature by estimating

such advanced econometric models for injury-severity.

2.2.5 Explanatory Factors

The last major column in Table 2-1 identifies the primary explanatory factors used

in past injury-severity models. These factors are classified into the following six

categories: (1) crash characteristics, (2) driver/occupant characteristics, (3) vehicle

characteristics, (4) environment characteristics, (5) roadway characteristics, and (6)

occupant protection characteristics. Each of these is discussed in the rest of this

section.

2.2.5.1 Crash characteristics

The characteristics of the crash that were estimated to influence injury severity

include time of day (such as peak or off-peak and daylight or dark), crash type (frontal,

rear-end, rollover, or other), and other crash descriptors (number of vehicles involved, at-

fault driver, harmful event/causation).

NHTSA (2008) reports that the period from midnight to 3 a.m. on Saturdays and

Sundays was the deadliest of all 3-hour periods, with more fatal crashes than any other

time period. Kockelman and Kweon (2002) also find that crashes on

Friday, Saturday, and Sunday during late night (midnight to 4 a.m.) are more severe and that

late night on Sundays was the most dangerous time. Analysis results from Chang and

Mannering (1999) suggest that the night time was more dangerous and rush-hour

crashes were less severe. Eluru and Bhat (2007) report that crashes occurring between

6 a.m. and 7 p.m. were less severe than crashes during other time periods. Eluru et al. (2009)

also report that 12 a.m. to 6 a.m. was the most dangerous time. Friday afternoons were

also shown to have more crashes but fewer fatal crashes in NHTSA (2008). It is

important to note that the time-of-day of the crash could be reflective of traffic volumes,

speeds, and lighting conditions. All the above results indicate that crashes during darker

and less-congested time periods are more severe than crashes during brighter and

more-congested times.

Chang and Mannering (1999) found that summer was the most dangerous season

followed by spring and autumn. Many other studies did not have statistically significant

effects of the season possibly because these effects are captured by variables

describing the road-surface condition (for instance wet/icy/snowy conditions could be

reflective of winter conditions). It is useful to note that Chang and Mannering did not

control for road-surface conditions.

Crash type is another important factor affecting injury severity. Head-on crashes

and crashes with a stationary object were most dangerous, and occupants of the vehicle being struck

sustained more severe injuries than those of the striking vehicle (Eluru et al., 2007).

Kockelman and Kweon (2002) as well as Duncan et al. (1998) report that rollover

crashes can result in more severe injuries. Other types of crashes such as the angle

and sideswipe have not been extensively examined. Crashes that lead to fire are found

to result in more severe injuries as would be expected. As hazardous cargo can lead to

fire in the event of a crash, countermeasures aimed at improving the safety of trucks

carrying such cargo become more important.

The number of vehicles involved in the crash was also important from the

standpoint of injury severity. NHTSA (2008) estimates that multi-vehicle crashes were

more dangerous using aggregate data and this result was also supported by

econometric models estimated by Chang and Mannering (1999). Yamamoto and

Shankar (2004) also report that more passengers in the vehicle increase the severity of

the most severe injury in the accident. At the same time, these researchers also find

that increasing number of passengers in the vehicle decreases the injury severity of the

driver. Such results suggest that focusing on the most-severe injury or the injury

sustained by one of the occupants is not adequate. Rather, the injuries sustained by all

persons involved in the crash must be studied to have a comprehensive understanding

of the crash severity.

Vehicle movement at the time of crash is also a factor influencing injury severity.

Crashes while negotiating curves and passing other vehicles were shown to have

higher probabilities of fatalities compared with other kinds of movements such as

turning and merging (NHTSA, 2008). Among turning movements, left-turns might be

particularly critical because of the possibility of conflicts with opposing streams of traffic

(for instance, the study on left-turn crashes at intersections by Wang and Abdel-Aty,

2008). The impact point has also been found to determine injury severity. Based on

research by Delen et al. (2006), occupants in the vehicle that is struck are more likely to

sustain severe injuries compared to the occupants in the striking vehicle.

2.2.5.2 Vehicle characteristics

The age, size, engine, and other characteristics of the vehicle(s) involved in the

crash affect the injury severity.

The research by Kockelman and Kweon (2002) indicates that drivers of light- and

heavy-duty trucks and minivans are better protected against injuries. Yamamoto and

Shankar (2004) report that drivers of large trucks sustain less-severe injuries. Khattak

and Rocha (2003) studied sport utility vehicles (SUVs) and found that SUVs were more

likely to roll over, but that their protective effect exceeded the harmful effect caused by rollover.

Therefore, compared with passenger-car occupants, SUV occupants have less-severe injuries

in the event of a crash. Eluru and Bhat (2007) showed that drivers in sedans were more

likely to be severely injured compared to others in two-vehicle crashes. According to

Kockelman and Kweon (2002), in two-vehicle crashes, heavy-duty trucks result in more

severe injury for the driver of the partner vehicle. Consistent results were achieved from

other research focusing on effect of vehicle types. In general, it appears that occupants

of heavier vehicles often have less-severe injuries. At the same time, if heavier vehicles

are involved in the crash, the overall severity of the crash could be higher because of

greater injuries to the occupants of the other vehicle(s) involved in the crash.

Some other factors such as vehicle age have also been studied. For example,

Yannis et al. (2005) report that older vehicles (age > 35 years) are involved in more

severe crashes. Khattak et al. (2003) report higher injury-severity to be associated with

large trucks manufactured before 1992.

2.2.5.3 Driver and occupant characteristics

As already discussed, many studies have focused on the injury sustained by the

drivers of vehicles. NHTSA (2008) estimates that drivers comprised 63% of the total

persons injured or killed in crashes and passengers accounted for only 28%. Therefore,

the substantive focus on drivers seems appropriate.

The age of the drivers and the vehicle occupants has been found to be strongly

related to injury severity. Wang and Abdel-Aty (2008) reported that very young

(age ≤ 19) and young drivers (19 < age ≤ 24) were more likely to sustain severe injuries.

Based on fatality and injury rates per 100,000 population (NHTSA, 2008), men aged

21-24 and women aged 16-20 had the highest fatality rates, while both males and

females aged 16-20 had the highest injury rates. Eluru et al. (2009) report that younger

drivers (16-20) are less likely to be severely injured compared with drivers over 65 years

of age. Overall, there appears to be a non-linear effect of age on injury severity, with the

youngest and the oldest being susceptible to more severe injuries (arguably for very

different reasons) compared to the middle-aged.

Eluru et al. (2009) also examined the effect of driver age on the injuries sustained

by other occupants in the vehicle. Drivers over 45 years old were estimated to be

associated with more severe injury to front-seat passengers, but the driver's age did not

significantly affect rear-seat passengers. Children less than 5 years old were less likely to

be severely injured when seated in the rear, and passengers over 65 were more likely to be

severely injured.

Gender is also expected to influence injury severity. Based on aggregate,

national-level crash data from 2007 (NHTSA, 2008), men had a significantly higher rate of

being killed or severely injured compared to women. However, econometric models

developed by Eluru and Bhat (2007) and Kockelman and Kweon (2002) suggest that

men suffer less severe injuries compared to women, after controlling for several other

factors that affect crash severity.

Alcohol is another important factor studied. There were 12,998 alcohol-impaired

driving fatalities in 2007 which accounted for 32% of all traffic fatalities for the year.

Among the fatal crashes occurring from midnight to 3 a.m., 65% involved alcohol-

impaired driving. All studies on alcohol (Duncan et al., 1998 and Eluru and Bhat, 2007)

confirmed that drivers under the influence of alcohol were likely to be more severely injured

than others, and Eluru et al. (2009) estimated that the alcohol consumption of the

drivers also affected the injury severity of the passengers. It is useful to note here that

alcohol records for passengers are not required in Police Accident Reports (PAR).

Driver fatigue is another critical aspect that affects injury severity. Gander et al.

(2006) reported that 41%-71% of the truck crashes were related to fatigue. Srinivasan

(2003) estimated that fatigued drivers were five times as likely to be seriously injured and

faced a 30% lower chance of experiencing a property damage only (PDO) event.

Sleepy drivers were more likely to be involved in more-severe crashes (Khattak et al.,

2003). Fatigue and sleeping habits can be expected to be even more important

attributes of truck drivers compared to car drivers.

Speeding was a major cause of crashes and contributed to 31% of all fatal

crashes in 2007 (NHTSA, 2008). Among drivers involved in fatal crashes, young males

were more likely to have been speeding, and speeding was clearly a deadly

combination with alcohol.

Data on drivers' history of traffic violations have also been studied. Drivers

with a violation history faced increases of about 15% and 22% in the probability of moderate

and severe injuries, respectively (Srinivasan, 2003).

For all the passengers in a vehicle, the seating position is also important.

O'Donnell and Connor (1996) found that the left-rear seating position is the most

dangerous seating position. Newgard (2008) concluded that seat position has a strong

correlation with the passenger's age.

2.2.5.4 Environmental characteristics

Environmental factors affecting injury severity include weather and light conditions.

In general bad weather has been found to be associated with lower injury severity

(Duncan et al., 1998 and Yamamoto and Shankar, 2004). This is possibly because

drivers are inherently more cautious and do not speed during bad weather (Eluru and

Bhat, 2007). Weather conditions are also reflective of the road surface conditions (for

example, rainy weather also leads to slippery road surface). The snow intensity and

wind gust speed were studied by Khattak and Knapp (2001). The results indicated that

higher wind gusts during snow events tended to result in more-severe injuries. The

negative effect of snowfall intensity on injury level was explained by greater snow

accumulation (due to higher snowfall intensity) acting as an attenuator.

Light conditions have more complex effects than weather. Wang and Kockelman

(2005) concluded that lack of light decreased the severity in one-vehicle crashes, but

increased the injury severity in two-vehicle crashes. Eluru and Bhat (2007) found drivers to be less

severely injured at dusk or in the dark with lighting, offering the same explanation as for

bad weather. Huang et al. (2008) found that crashes at night were more serious

than those during daytime and that poor street lighting could increase the odds of

a severe crash by about 69%.

Given the strong interdependencies between weather, lighting conditions, and

driving behavior (drivers are inherently more cautious during bad weather and bad

lighting conditions and perhaps the converse is true under good weather and good

lighting conditions), additional research on effectively disentangling the marginal effects

of these correlated factors is needed.

2.2.5.5 Roadway characteristics

Roadway characteristics include roadway design, location, surface condition,

traffic control and traffic volume.

Because speed is an important determinant of injury severity and the actual speed is

seldom recorded, the speed limit is often used as an important proxy variable in injury-

severity studies. As commonly acknowledged, roads with higher speed limits have higher

injury severity. The medium-to-high speed limit (26-64 mph) was most dangerous and

the high speed limit (>65 mph) caused more severe injury than low speed limit roads

(45 mph) was

considered to be dangerous to car drivers at intersection and to both car and truck

drivers at straight section.

Speed limit is also indicative of the location of the crash, such as an interstate

highway or a state road. High speeds occur on roads in good condition, mainly highways,

whereas lower speeds may occur on rural roads and highway ramps. Chang and

Mannering (1999) estimated that rural areas were more dangerous than urban areas.

Furthermore, road classification, curves, and uphill or downhill grades also influenced the injury

severity in combination with the speed limit. Wang and Kockelman (2005)

found that crashes on curves were more severe, while uphill grades increased the injury level

in one-vehicle crashes but decreased it in two-vehicle crashes. Road signs, such as

flashing lights, electronic displays of traffic conditions, and camera warning signs, also

influence travel speed. The validity of these speed-control methods has been

studied (Alicandri and Warren, 2003; Sarasua et al., 2006).

Another crucial factor of injury is the traffic condition, which is usually a temporal

factor. Traffic varies significantly between peak and off-peak hours of the day. Night time

has less traffic and higher speeds. Without considering special occasions, such as

congestion caused by an incident, a sporting event, or a work zone, traffic conditions are reflected by the time of

the crash. Special traffic conditions that disrupt traffic also affect the probability of

crashes and injuries, but these records are not available in most police-record data.

Higher truck percentages on the roadway were shown to decrease the injury

severity of accidents (Milton et al., 2008), and this was explained by the slowing effect of

the sheer number of trucks on travel speeds.

2.2.5.6 Occupant protection

Occupant protection mainly refers to seat belts and airbags.

NHTSA (2008) estimated that 15,147 lives were saved in 2007 by the use of seat

belts. Seat belt use was studied by Eluru and Bhat (2007), which indicated that men,

younger individuals (age


which an airbag was not deployed. This was because the airbag deployed when the vehicle was struck violently and the occupant was severely jolted.

2.3 Contribution of this Dissertation

The discussions thus far clearly highlight the value of improving truck safety and of the studies conducted to date on identifying appropriate countermeasures. In light of these discussions, the following are the main empirical and methodological contributions of this dissertation.

Empirical Contributions:

The focus of this research is on large-truck crashes. Despite the unique aspects of

truck-crashes and the importance of safe trucking to the economy of the country,

research on the factors affecting injury-severity has been limited. This study contributes

by using a recently-assembled national-level database to develop econometric models

to relate injury-severity to a wide variety of explanatory factors. A rich empirical

specification will be developed that incorporates the marginal effects of several

explanatory factors including crash characteristics, driver characteristics, vehicle

characteristics, environment and roadway characteristics.

Methodological Contributions:

Almost all past research in injury-severity modeling has used relatively simple and restrictive methods of analysis. For example, a key shortcoming of past research is that the interdependencies among the injuries sustained by the different persons involved in the same crash have largely been ignored. These interdependencies will be explicitly incorporated in our models, leading to more comprehensive descriptions of the overall severity of multi-person, multi-vehicle crashes. A second key shortcoming is that the popularly used ordered-response models are restrictive in capturing the effects of explanatory variables on the different levels of injury severity. Specifically, if a factor is estimated to increase the probability of the most-severe injury (i.e., fatal injury), then the specification implies that the same factor necessarily decreases the probability of the least-severe injury (often the "no-injury" category). This research will use advanced methods such as the OGEV models to develop flexible empirical structures.
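The restriction of ordered-response models described above can be illustrated numerically. The following is a minimal sketch, assuming a standard ordered logit with three severity levels; the threshold values, the index values, and the function name ordered_logit_probs are hypothetical and chosen only for illustration, not taken from the models estimated in this dissertation.

```python
import math


def logistic_cdf(z):
    """Cumulative distribution function of the standard logistic distribution."""
    return 1.0 / (1.0 + math.exp(-z))


def ordered_logit_probs(xb, thresholds):
    """Return the probability of each ordered severity level.

    xb         -- value of the linear index (beta' x) for one crash
    thresholds -- increasing cut points separating adjacent levels
    """
    cuts = [-math.inf] + list(thresholds) + [math.inf]
    # Evaluate the CDF at each cut point, handling the infinite end points directly.
    cdf = [
        0.0 if c == -math.inf else 1.0 if c == math.inf else logistic_cdf(c - xb)
        for c in cuts
    ]
    return [cdf[k + 1] - cdf[k] for k in range(len(cuts) - 1)]


# Hypothetical thresholds for three levels: no injury, injury, fatal.
thresholds = [0.0, 2.0]
low = ordered_logit_probs(0.5, thresholds)   # smaller linear index
high = ordered_logit_probs(1.5, thresholds)  # larger linear index

print("no injury:", round(low[0], 3), "->", round(high[0], 3))
print("fatal:    ", round(low[2], 3), "->", round(high[2], 3))
```

In this sketch, any factor that raises the linear index raises the fatal probability and lowers the no-injury probability at the same time, whatever the threshold values are; the two effects cannot be estimated independently.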


CHAPTER 3 DATA

In this chapter, we first present an overview of the source and characteristics of the LTCCS data. Then, the detailed procedures for data cleaning and variable reduction are described, and a description of the sample data is presented at the end of the chapter.

3.1 Data Source and Raw LTCCS Data Characteristics

This study uses data from the Large Truck Crash Causation Study (LTCCS).

These data represent a sample of large-truck crashes that occurred between April 2001

and December 2003. Data on approximately a thousand crashes were collected from 24

sites in 17 states. Each crash in the LTCCS sample involved at least one large truck and resulted in at least one injury. These data were collected by the Federal Motor

Carrier Safety Administration (FMCSA) and the National Highway Traffic Safety

Administration (NHTSA) of the U.S. Department of Transportation (DOT).

USDOT/FMCSA (2006) and Hedlund and Blower (2006) provide an overview of the

data. The full data and related documentation are available for download from the

LTCCS website at: http://ai.fmcsa.dot.gov/ltccs/default.asp.

The raw database has over a thousand data fields and is organized into 58

different files. The data sample includes 1,070 cases with 2,284 vehicles (including

1,141 large trucks and 1,043 passenger vehicles) and 3,014 occupants (including

drivers and passengers). The crashes resulted in 251 fatalities and 1,499 injuries. Injury

severity is categorized into no injury, possible injury, non-incapacitating injury,

incapacitating injury, and killed.


Elements that influence the severity are drawn from 43 data files and categorized

into Injury Severity, Crash Description, Characteristics of the Occupant, Characteristics

of the Driver (demographics, fatigue, history of crashes/violations, behavior, health,

drugs/alcohol), Characteristics of the Vehicle (physical attributes, cargo, history of

crashes/violations), Characteristics of the Environment (roadway features, weather) and

Characteristics of the Carrier.

Furthermore, the dataset can be divided into occupant, vehicle, truck, crash, and carrier levels to better describe and analyze the factors, because it was collected by several departments focusing on different fields and some records are not complete for all the crashes.

3.2 Sample Formation Procedure

Extensive data processing was undertaken which involved cleaning, consistency

checks, and variable selection (this was particularly important given the significant

correlations observed among the different variables in the database) using the SPSS

statistical software.

3.2.1 Selecting Cases

Some cases in the raw data are from the pilot phase of the project or were found not to meet the selection criteria. The variable RATWeight is used as a weight to produce nationally representative estimates, and zero-weight cases should be dropped first. After dropping 107 zero-weight cases, 963 cases remain for further processing.

3.2.2 Cleaning and Consistency Checking

Because the raw data contain uninformative variables and inconsistent cases, cleaning and consistency checking are applied during sample formation to keep the data usable and consistent.

Sample cleaning starts with a frequency test for all factors, because a variable is uninformative if over 99% of its observations take the same value. Therefore, for each category, we test the frequencies at the corresponding level, e.g., injury severity at the occupant level, crash and environment description at the crash level, characteristics of the occupant for car and truck occupants separately, characteristics of the vehicle and driver at the car and truck levels, and characteristics of the carrier for the available carrier records. We then drop the binary variables that are present in less than 1% or more than 99% of the cases.
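The frequency screen described above can be sketched as follows. This is only an illustration under stated assumptions: the original processing was done in SPSS, and the function name screen_binary_variables, the variable names, and the records are hypothetical.

```python
def screen_binary_variables(records, low=0.01, high=0.99):
    """Split binary variables into kept and dropped lists by their share of 1s.

    records -- list of dicts mapping a variable name to 0 or 1
    low     -- variables present in fewer than this share of cases are dropped
    high    -- variables present in more than this share of cases are dropped
    """
    kept, dropped = [], []
    for name in records[0]:
        share = sum(r[name] for r in records) / len(records)
        (dropped if share < low or share > high else kept).append(name)
    return kept, dropped


# Hypothetical crash-level records: 200 cases with three binary indicators.
records = [
    {"rain": int(i % 4 == 0), "tornado": int(i == 0), "paved": int(i != 1)}
    for i in range(200)
]
kept, dropped = screen_binary_variables(records)
print("kept:", kept)
print("dropped:", dropped)
```

Here the hypothetical indicator present in 25% of cases is retained, while the indicators present in 0.5% and 99.5% of cases are dropped because they carry almost no variation.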

The consistency checking includes two steps. First, we check variables within each category, such as crash type, driver and occupant number, and time of day and daylight. Second, we check across categories, for example, the aggregated vehicle records against the total number of vehicles, the crash type across the crash and vehicle levels, and the number of vehicles against the crash type. The results are mostly consistent, and the number of inconsistent cases is not large enough to affect the analysis results. After correcting some apparently inconsistent cases according to the variables recorded in detail, we dropped the other 10 inconsistent cases. The sample size at each level is shown in Table 3-1, and the variables remaining after cleaning and consistency checking of the raw data are listed in the Appendix.

3.2.3 Variable Selection

The number of variables is limited by the sample size. Variable selection can merge correlated variables and reduce the number of variables in the model, which simplifies the modeling procedure and reduces the run time.

3.2.3.1 Crosstab check

If two or more variables describe the same or correlated aspects of a subject, they are expected to be correlated. Taking the roadway traffic (a categories) and the restriction (b categories) as an example, a restriction on the road is correlated with the traffic condition. Merging the two variables into one and recombining the values reasonably can reduce the a × b category combinations to c categories (c < a × b). The same situation exists for other variables, such as seatbelt use, eye vision, and carrier status.

3.2.3.2 Classification analysis

When more than three correlated or similar variables are involved, cross-tabulation analysis may have to be repeated several times, so classification analysis can be applied to reduce the variables more easily. In the cleaned dataset of crash characteristics, seven variables are used to describe the weather condition before and at the time of the crash (ENVRain, ENVSnow, ENVFog, AFTRain, AFTSnow, AFTFog, and Weather). The hierarchical cluster command can merge these seven binary variables into clusters by distance or similarity measures. The cluster output implies that the variables ENVRain, ENVFog, AFTRain, and AFTFog can be aggregated into one group and ENVSnow and AFTSnow into another group. For continuous variables such as driver height and weight, even after grouping the cases into ten-centimeter and ten-kilogram intervals, there are still over 6 groups for these two variables. The K-means cluster method can be used to group the cases into clusters using the distribution of these two variables.
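The idea of clustering binary variables by a distance measure can be sketched as follows. This is an illustrative stand-in, not the SPSS procedure used in the study: the simple-matching distance, the single-linkage rule, and the eight hypothetical cases are all assumptions, and only four of the weather variables are included to keep the example short.

```python
def matching_distance(u, v):
    """Share of cases in which two binary variables disagree."""
    return sum(a != b for a, b in zip(u, v)) / len(u)


def single_linkage(columns, n_clusters):
    """Agglomerate variables until n_clusters groups remain.

    columns -- dict mapping a variable name to its list of 0/1 values
    """
    clusters = [[name] for name in columns]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: distance between the two closest members.
                d = min(
                    matching_distance(columns[a], columns[b])
                    for a in clusters[i]
                    for b in clusters[j]
                )
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        # j > i, so removing cluster j does not shift the position of cluster i.
        clusters[i] = clusters[i] + clusters.pop(j)
    return [sorted(c) for c in clusters]


# Hypothetical indicator values for eight crashes (not LTCCS data).
weather = {
    "ENVRain": [1, 1, 1, 0, 0, 0, 0, 0],
    "ENVSnow": [0, 0, 0, 0, 0, 1, 1, 1],
    "AFTRain": [1, 1, 0, 0, 0, 0, 0, 0],
    "AFTSnow": [0, 0, 0, 0, 0, 0, 1, 1],
}
groups = single_linkage(weather, n_clusters=2)
print(groups)
```

With these hypothetical values the rain indicators end up in one group and the snow indicators in the other, which mirrors the kind of grouping reported above.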

3.2.3.3 Missing data

There were several variables (particularly those describing driver and vehicle characteristics) of potential interest that had missing values for a significant fraction of the cases. Further, it was also found that these values were "systematically missing" for many of the variables. For example, the value of certain variables could be more likely to be missing for crashes with greater severity. Alternatively, the value of other variables could be more likely to be missing for crashes with lesser severity. Consequently, simply removing the cases with missing values would skew the sample (in addition to reducing the sample size). Therefore, we chose to retain all cases, and indicator variables were created to explicitly identify cases with missing values for each of the variables of interest. These indicator variables were also included in the model specifications, and some of them turned out to be statistically significant (discussed further in the next section).
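The missing-value indicators described above can be sketched as follows. This is a minimal illustration, not the procedure actually run in SPSS: the function name add_missing_indicator, the variable hours_driving, and the choice to fill the original variable with zero are assumptions.

```python
def add_missing_indicator(records, variable, fill_value=0.0):
    """Return copies of the records with a 0/1 missing-value flag for one variable.

    A missing value is represented by None. The original variable is set to
    fill_value (an assumed convention) so that the flag carries the effect
    of missingness in a model.
    """
    flagged = []
    for r in records:
        row = dict(r)  # copy so that the input records are left unchanged
        is_missing = row[variable] is None
        row[variable + "_missing"] = int(is_missing)
        if is_missing:
            row[variable] = fill_value
        flagged.append(row)
    return flagged


# Hypothetical driver-level records; None marks a value that was not recorded.
drivers = [
    {"hours_driving": 6.5},
    {"hours_driving": None},
    {"hours_driving": 10.0},
]
flagged = add_missing_indicator(drivers, "hours_driving")
for row in flagged:
    print(row)
```

Keeping every case and adding the flag avoids the sample skew that would result from deleting cases whose values are systematically missing.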

Missing data are common in this dataset. To deal with them, it is best to start from the type of missingness, because each type requires a proper processing method.

Harrell (2001) classified missing data into three types: missing completely at random (MCAR), missing at random (MAR), and informative missing (IM).

Missing completely at random (MCAR) refers to the situation in which data elements are missing for reasons that are unrelated to any characteristics or responses of the subject. Examples include a subject omitting the response to a question for reasons unrelated to the response she would have made or to her characteristics (e.g., overlooking the question), and losing the data because of a mistake in the experiment. This kind of missing data tends to be small in number.

Missing at random (MAR) does not mean that data elements are missing purely at random; rather, the probability that a value is missing depends on the values of variables that were actually measured and does not depend on the unobserved data. An example is that the large-truck weight is not always recorded because not all the carriers provide this information, while the carrier information itself is recorded.

Informative missing (IM) refers to the situation in which elements are more likely to be missing when the true values of the variable are systematically higher or lower. For example, a heavily injured driver would have difficulty in helping record the driving information, such as speed, emotion, or road familiarity. IM is also called nonignorable nonresponse.

In the studied dataset, a large number of variables have values recorded as unknown. When such a variable records the crash type, the location, or some other existing condition, the unknown values are most commonly missing completely at random (MCAR). These unknown values are few and ignorable. Usually, they are joined with other low-frequency values to form a new value. If the unknown values account for under 1% of the cases, joining them to any group will not affect the result significantly. Unknown values accounting for over 1% can be recoded to other values by linear interpolation or the mean. This procedure reduces the number of values for the variables.

Another kind of "unknown" or missing record is not MCAR and is nonignorable. For example, for the presence of alcohol at the crash level, 11.9% of the records are reported as unknown. This may be caused by the inability to obtain a record from severely injured people or by carelessness when recording a no-injury crash. The crash reports also contain a large number of missing records, such as the cargo weight and driver behavior, recorded as unknown. For these types of missing data, we keep unknown as a value and model its impact on the injury response.

The same unknown situation can arise in the records of individual age, gender, speed limit, etc. However, if the count of unknown values for a variable is not large, it is not necessary to keep unknown as a special value. For a low-frequency (2%) unknown value, we consider it ignorable and replace it with the mean, the median, or a linear trend.
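The two treatments of unknown values can be combined into a simple rule, sketched below. This is an assumption-laden illustration: the 2% figure is read here as an upper bound on the share of unknowns, the median is used as the replacement, and the function name handle_unknown and all data values are hypothetical.

```python
from statistics import median


def handle_unknown(values, threshold=0.02, unknown="unknown"):
    """Impute rare unknowns with the median; otherwise keep them as a category.

    Returns the processed list and the share of unknown values found.
    """
    share = sum(v == unknown for v in values) / len(values)
    if share == 0 or share > threshold:
        # Frequent unknowns are kept so that their effect can be modeled.
        return list(values), share
    known_median = median(v for v in values if v != unknown)
    return [known_median if v == unknown else v for v in values], share


# Hypothetical speed limits with 1% unknown: the unknown is imputed.
speed_limit = [55] * 60 + [65] * 39 + ["unknown"]
imputed, speed_share = handle_unknown(speed_limit)

# Hypothetical alcohol records with 12% unknown: the category is kept.
alcohol = ["no"] * 80 + ["yes"] * 8 + ["unknown"] * 12
kept, alcohol_share = handle_unknown(alcohol)

print(speed_share, imputed[-1])
print(alcohol_share, kept.count("unknown"))
```

The rule keeps the information in frequent, possibly informative unknowns while avoiding an extra category for variables where unknowns are rare.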

3.3 Sample Characteristics

The final "reduced" dataset assembled for this analysis was organized into the

three major files: Crash-level data (including highest level of injury severity in the crash,

crash type, and environment variables), Vehicle-level data (characteristics of the trucks

and cars such as age, body-type, cargo, occupancy levels, and deficiencies), and

Driver-level data (characteristics of the truck and car drivers including demographics,

fatigue, health, and behavior). The effects of all these different variables were examined

during the statistical-modeling procedure.

At the crash level, we have 953 cases in the estimation sample. For the occupant-injury study, we select 918 representative crashes, which have at most 3 trucks, at most 3 cars, and at most 5 occupants per vehicle (including the driver). In total, there are 2,374 occupants in the selected sample, including 1,038 truck drivers, 145 truck occupants, 818 car drivers, and 373 car passengers.

Table 3-1. Sample size for each category.

Variable category        Level            Sample size (cases)
Injury severity          Occupant         2699
Crash                    Vehicle          2056
                         Crash            953
Truck and truck driver   Truck            1108
Car and car driver       Car              941
Truck occupant           Truck occupant   151
Car occupant             Car occupant     499
Carrier                  Truck carrier    733

CHAPTER 4 MODELS FOR CRASH LEVEL INJURY

The intent of this chapter is to present a comprehensive analysis of the "injury severity" of all types of crashes involving large trucks. Because this is the first step of

this dissertation, as many variables as possible are estimated to understand the

relationship between injury severity and a vast number of inter-dependent explanatory

factors using crash level data. The injury severity of a crash is defined as the highest

level of severity among all those injured in the crash.
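The crash-level definition above amounts to taking a maximum over an ordered scale. A minimal sketch follows, using the five severity categories listed in Chapter 3; the names SEVERITY_ORDER, RANK, and crash_severity and the example occupants are hypothetical.

```python
# Severity levels ordered from least to most severe.
SEVERITY_ORDER = [
    "no injury",
    "possible injury",
    "non-incapacitating injury",
    "incapacitating injury",
    "killed",
]
RANK = {level: i for i, level in enumerate(SEVERITY_ORDER)}


def crash_severity(occupant_severities):
    """Return the highest severity level among the occupants of one crash."""
    return max(occupant_severities, key=RANK.__getitem__)


# Hypothetical crash with three occupants.
occupants = ["no injury", "incapacitating injury", "possible injury"]
print(crash_severity(occupants))
```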

4.1 Sample Data

There are 953 crashes in the estimation sample. The data were collected from

both police accident reports and additional sources such as site investigation,

interviews, and review of medical records conducted by the LTCCS team. Each crash

was investigated by a two-person team comprising a trained researcher and a state truck inspector (see USDOT/FMCSA, 2006, for an overview). Correspondingly, there are two

measures of injury severity available for each crash: (1) a measure determined from the

police accident reports (referred to as PAR) and (2) a measure determined by the

LTCCS researchers based on medical records and case narratives (referred to as RES

for researcher-determined).

A cross tabulation of the highest injury severity level of the 953 crashes

determined from each of PAR and RES is presented in Table 4-1. In the case of PAR,

the severity was measured on a four-item scale: possible injury (14%), non-incapacitating injury (32%), incapacitating injury (32%), and killed (22%). In the case of RES, the severity was measured on a three-item scale: non-incapacitating injury (48%), incapacitating injury (29%), and killed (23%). The RES scale has fewer levels because the crashes in the LTCCS were sampled such that there is at least one injured person on the RES scale.

The cross-tabulation indicates that fatal crashes are recorded almost identically by both measures (approximately 22% each). However, discrepancies are observed in the case of non-fatal crashes. About 12% of all crashes (119 crashes) were classified as possible injury (level C) by PAR but as non-incapacitating injury (level B) by RES, indicating under-estimation of the injury severity by the police reports (consistent with the findings of others such as Tsui et al., 2009). At the same time, 103 crashes classified as incapacitating injury (level A) by PAR were classified as level B by RES, indicating an over-estimation by the police reports.

Given these discrepancies, it is useful to compare models estimated using each of

these severity measures. Such an effort is undertaken in this study.

It is also important to note that the sampling of the crashes was not purely random. Therefore, weights have been calculated to scale the sample to be nationally representative (the RATWEIGHT variable described in the user's manual by USDOT, 2006). Table 4-1 also presents the weighted percentages of the injury-severity levels according to the two measures (the last row and last column). The results indicate that crashes with higher levels of severity (particularly fatal crashes) have been oversampled.
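The effect of the sampling weights on the severity distribution can be sketched as follows. This is a minimal illustration with made-up numbers: the weights stand in for the RATWEIGHT variable, and the function name weighted_shares and the four cases are hypothetical.

```python
from collections import defaultdict


def weighted_shares(cases):
    """Return (unweighted %, weighted %) of cases for each severity level.

    cases -- list of (severity_level, weight) pairs
    """
    counts, weight_sums = defaultdict(int), defaultdict(float)
    for level, weight in cases:
        counts[level] += 1
        weight_sums[level] += weight
    n = len(cases)
    total_weight = sum(weight_sums.values())
    return {
        level: (100.0 * counts[level] / n, 100.0 * weight_sums[level] / total_weight)
        for level in counts
    }


# Hypothetical sample in which fatal crashes are oversampled and so carry small weights.
cases = [("killed", 1.0), ("killed", 1.0), ("injury", 9.0), ("injury", 9.0)]
shares = weighted_shares(cases)
print(shares)
```

A level whose weighted share is well below its unweighted share, as for the fatal crashes in this toy sample, has been oversampled relative to the population the weights represent.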


Table 4-1. Cross tabulation of police-determined (PAR) and researcher-determined (RES) injury severity levels.

                           RES
PAR                        Possible  Non-incapacitating  Incapacitating  Killed  Total  Percentage  Weighted
                           Injury    Injury              Injury                                     Percentage
Possible Injury            0         119                 15              0       134    14.06       14.39
Non-incapacitating Injury  0         239                 61              2       302    31.69       31.32
Incapacitating Injury      0         103                 192             10      305    32.00       45.90
Killed                     0         1                   5               206     212    22.25       8.40
Total                      0         462                 273             218
Percentage                           48.48               28.65           22.88
Weighted Percentage                  55.00               36.52           8.48
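The marginal totals and the extent of agreement in Table 4-1 can be checked with a few lines of arithmetic. The sketch below uses the counts exactly as they appear in the table; the variable names are hypothetical.

```python
LEVELS = ["possible", "non-incapacitating", "incapacitating", "killed"]

# Counts from Table 4-1: rows are PAR levels, columns are RES levels.
TABLE = [
    [0, 119, 15, 0],
    [0, 239, 61, 2],
    [0, 103, 192, 10],
    [0, 1, 5, 206],
]

n = sum(sum(row) for row in TABLE)
row_totals = [sum(row) for row in TABLE]
col_totals = [sum(col) for col in zip(*TABLE)]

# Crashes on the diagonal are classified identically by both measures.
agree = sum(TABLE[i][i] for i in range(len(LEVELS)))
# Above the diagonal RES is more severe than PAR; below it, less severe.
res_higher = sum(TABLE[i][j] for i in range(4) for j in range(4) if j > i)
res_lower = sum(TABLE[i][j] for i in range(4) for j in range(4) if j < i)

print("PAR totals:", row_totals)
print("RES totals:", col_totals)
print("agree:", agree, "RES higher:", res_higher, "RES lower:", res_lower)
```

With the counts as transcribed here, the row and column totals reproduce those reported in the table, and the two measures agree in 637 of the 953 crashes.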


4.2 Methodology

Injury severity is generally recorded on an ordinal scale. Most commonly, a five-level scale in increasing order of injury severity is used: Property-Damage Only, Possible Injury, Non-incapacitating Injury, Incapacitating Injury, and Killed (Fatal). Thus, an ordered-response discrete-choice model is appropriate for the analysis of such data. In fact, this has been a popular approach for modeling injury severity in general (for example, Kockelman and Kweon, 2001). In this approach, for any crash n, the observed, ordinal, injury-severity level
