AN ANALYSIS OF INJURY SEVERITIES OF LARGE-TRUCK CRASHES
By
XIAOYU ZHU
A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT
OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY
UNIVERSITY OF FLORIDA
2011
© 2011 Xiaoyu Zhu
To my parents
ACKNOWLEDGMENTS
I would like to take this opportunity to thank my parents. They are always doing
their best to provide me opportunities to pursue my goal. Their endless support and
encouragement lead me through every step in my life.
I would also like to thank the faculty members at the University of Florida (UF),
who have provided me the huge amount of knowledge and skills during my Ph.D. study.
I would like to thank my advisor Dr. Siva Srinivasan for constantly being a source of
inspiration. He provided not only the foundation of this research, but also approaches to
a successful research, which will be a lifelong benefit. I would like to thank Dr. Lily
Elefteriadou for being one of my outstanding examples as women professors, Dr. Scott
Washburn for his positive attitude towards life and work, and Dr. Yafeng Yin for his
constant enthusiasms and innovations in the research. I also would like to thank Dr.
Chunrong Ai and Dr. Trevor Park for their helpful comments from various perspectives
to make this research in a broader context.
Special thanks go to Neng Fan for his support and accompanying in both my life
and study during these four years. I would also like to thank all my friends and
colleagues for making my life in Gainesville enjoyable.
TABLE OF CONTENTS
ACKNOWLEDGMENTS
LIST OF TABLES
LIST OF FIGURES
ABSTRACT
CHAPTER
1 INTRODUCTION
1.1 Background: Traffic Safety and Large Trucks
1.2 Objectives of the Research
1.3 Organization of the Document
2 LITERATURE REVIEW
2.1 Research on Large-Truck Crashes
2.2 Research on Modeling Injury-Severity of Automobile Crashes
2.2.1 Injury of Interest
2.2.2 Levels of Injury Severity
2.2.3 Data Sources
2.2.4 Modeling Method
2.2.4.1 Treatment of ordinal
2.2.4.2 Incorporating interdependencies among the injuries of all persons involved in the crash
2.2.5 Explanatory Factors
2.2.5.1 Crash characteristics
2.2.5.2 Vehicle characteristics
2.2.5.3 Driver and occupant characteristics
2.2.5.4 Environmental characteristics
2.2.5.5 Roadway characteristics
2.2.5.6 Occupant protection
2.3 Contribution of this Dissertation
3 DATA
3.1 Data Source and Raw LTCCS Data Characteristics
3.2 Sample Formation Procedure
3.2.1 Selecting Cases
3.2.2 Cleaning and Consistency Checking
3.2.3 Variable Selection
3.2.3.1 Crosstab check
3.2.3.2 Classification analysis
3.2.3.3 Missing data
3.3 Sample Characteristics
4 MODELS FOR CRASH LEVEL INJURY
4.1 Sample Data
4.2 Methodology
4.3 Empirical Results
4.3.1 Crash-level Variables
4.3.2 Truck-level Variables
4.3.3 Car-level Variables
4.4 Contributions
5 THE PANEL HETEROSKEDASTIC ORDERED PROBIT MODEL FOR OCCUPANT-LEVEL INJURY SEVERITY STUDY
5.1 An Exploratory Analysis of Occupant-level and Crash-level Injury Severities
5.2 Sample Data
5.3 Methodology
5.4 Empirical Results
5.4.1 Truck Occupants
5.4.2 Car Drivers
5.4.3 Car Passengers
5.5 Contributions
6 THE PANEL HETEROSKEDASTIC ORDERED GENERALIZED EXTREME VALUE MODEL IN INJURY SEVERITY STUDY
6.1 Background
6.2 Methodology
6.3 Empirical Result
6.3.1 Truck Occupants
6.3.2 Car Drivers
6.3.3 Car Passengers
6.3.4 Application and Sensitivity Testing
6.4 Contributions
7 SUMMARY AND CONCLUSIONS
7.1 Contributions of the Dissertation
7.1.1 Methodological Contributions
7.1.2 Empirical Contributions
7.2 Directions for Further Research
APPENDIX: DESCRIPTIONS FOR THE VARIABLES
LIST OF REFERENCES
BIOGRAPHICAL SKETCH
LIST OF TABLES
Table
3-1 Sample size for each category.
4-1 Cross tabulation of police-determined and researcher-determined injury severity levels.
4-2 Empirical model results: effects of crash-level variables.
4-3 Empirical model results: effects of truck-level variables.
4-4 Empirical model results: effects of car-level variables.
5-1 List of possible combinations of occupant's injury
5-2 Factors affecting the injury severity of truck occupants.
5-3 Factors affecting the injury severity of car drivers.
5-4 Factors affecting the injury severity of car passengers.
6-1 Model comparison.
6-2 Standard deviation of intra-vehicle correlation term.
6-3 Factors affecting the injury severity of truck occupants.
6-4 Factors affecting the injury severity of car drivers.
6-5 Factors affecting the injury severity of car passengers.
6-6 List of sensitivity test.
6-7 Sensitivity test: effect of airbags on the injury severity of truck drivers in truck-car head on crashes.
6-8 Sensitivity test: effect of truck driver behavior on the injury severity of truck drivers in truck-car head on crashes.
6-9 Sensitivity test: effect of seatbelts on the injury severity of truck drivers in truck-car head on crashes.
6-10 Sensitivity test: effect of crash type on the injury severity of truck drivers.
6-11 Sensitivity test: effect of airbag deployment on the injury severity of car drivers in truck-car head on crashes.
6-12 Sensitivity test: effect of airbag availability on the injury severity of car drivers in truck-car head on crashes.
6-13 Sensitivity test: effect of alcohol on the injury severity of car drivers in truck-car head on crashes.
6-14 Sensitivity test: effect of crash type on the injury severity of car drivers.
6-15 Sensitivity test: effect of airbag deployment on the injury severity of car passengers in truck-car head on crashes.
6-16 Sensitivity test: effect of airbag availability on the injury severity of car passengers in truck-car head on crashes.
6-17 Sensitivity test: effect of car driver distraction on the injury severity of car passengers in truck-car head on crashes.
6-18 Sensitivity test: effect of truck driver DUI on the injury severity of car passengers in truck-car head on crashes.
A-1 Injury severity characteristics.
A-2 Crash characteristics (Crash level).
A-3 Crash characteristics (Vehicle level).
A-4 Truck driver characteristics.
A-5 Car driver characteristics.
A-6 Truck characteristics.
A-7 Car characteristics.
A-8 Truck occupant characteristics.
A-9 Car occupant characteristics.
A-10 Carrier characteristics.
LIST OF FIGURES
Figure
5-1 A cross-tabulation of cumulative injury severity and highest injury severity
5-2 Distribution of the cumulative injury severities by highest injury severity
5-3 Cross tabulations of cumulative injury cost against the highest injury severity
5-4 Cross tabulation of average injury severity against number of occupants by HIC value
5-5 Distribution of injury severity levels by occupant type
Abstract of Dissertation Presented to the Graduate School of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy
AN ANALYSIS OF INJURY SEVERITIES OF LARGE-TRUCK CRASHES
By
Xiaoyu Zhu
May 2011
Chair: Sivaramakrishnan Srinivasan
Major: Civil Engineering
Traffic crashes have become one of the largest public health problems in the world
and will remain one of the most pressing transportation issues in the future. The
importance of trucking to freight logistics and, consequently, its impact on the economic
well-being of a nation are well acknowledged. There is a need to study crashes
in order to improve the safety of the transportation system, inform driver education,
enhance carrier operations, and reduce incident costs. Data from the Large Truck
Crash Causation Study (LTCCS) are used in the empirical analysis.
The goal of developing econometric models of injury severity in large-truck crashes is
accomplished in a three-step procedure. The first step of this dissertation contributes
towards that end by examining the relationship between injury severity and a large
number of inter-dependent explanatory factors using a crash-level sample. The injury
severity is modeled using both police-reported and researcher-determined scales. The
results indicate strong impacts of several crash-, truck-, and car-level variables on
the severity of the crashes.
Then we proceed to the occupant level study in the second step because the
highest injury severity cannot fully represent the severity of the whole crash. The
methodology of incorporating the effects of common unobserved factors (error
correlations) affecting the injury-severity of all persons involved in the same vehicle and
crash is developed. Both the intra-crash and intra-vehicle correlations are confirmed to
be important in the second step.
A more advanced and flexible model structure is explored in the last step.
This approach is attractive as it recognizes the ordered nature of the choice alternatives
and, at the same time, is not constrained by the "proportional odds" or "parallel lines"
restrictions of the ordered probit. The results indicate that variables which are not
significant in the ordered-probit model may still have an impact on injury severity. The
significant driver-behavior variables also differ across roles (truck occupant, car driver,
and car passenger).
In summary, advanced and flexible methodologies for occupant-level injury
severity analysis are developed and compared in this dissertation. The results and
implications are useful from the standpoints of travelers, transportation engineers, and
policy makers.
CHAPTER 1 INTRODUCTION
This chapter motivates the need to study large-truck crashes and outlines the
objectives of the dissertation. The organization of the rest of this document is also
presented.
1.1 Background: Traffic Safety and Large Trucks
According to the World Health Organization, more than a million people are killed
on the world's roads each year (Leonard, 2004). In the United States, 26,689 occupants
(drivers and passengers) died and an additional 2,120,000 were injured in the
5,811,000 crashes in 2008 according to the National Highway Traffic Safety
Administration (NHTSA, 2009). Clearly, these numbers highlight that traffic safety is
one of the critical public-health and transportation problems in the world.
Among all motor-vehicle traffic crashes, the focus of this study is on crashes
involving large-trucks. A large truck is defined as a commercial vehicle weighing more
than 10,000 lbs. The importance of trucking to freight logistics and, consequently, its
impact on the economic well-being of a nation is well acknowledged in the literature.
Specifically, based on the 2007 Commodity Flow Survey results, among all the modes,
trucks moved 74.3% of all freight by value, 67.2% by weight, and 40% by ton-miles
(USDOT/BTS, 2004). These large volumes of truck traffic, the unique operating
characteristics of the trucks and drivers, and the design and weight of trucks have
resulted in large numbers of crashes, injuries, and fatalities.
In 2005, over 5,000 people died and an additional 114,000 were injured in the
442,000 large-truck crashes in the United States. Approximately 12% of all traffic
fatalities involved a large-truck crash (NHTSA, 2006). In 2007, 413,000 large trucks
were involved in traffic crashes resulting in 4,808 fatalities, or 12% of all traffic
fatalities (NHTSA, 2008). Large trucks account for approximately 4% of all vehicles
but are about 8% of vehicles in fatal crashes. 75% of the fatalities that resulted from
crashes involving large trucks were occupants of other vehicles. In addition to all the
above cross-sectional statistics, time-series trends reported by Lyman and Braver
(2003) are also illuminating. Based on aggregate data from 1975 to 1999, these authors
find that the involvement of large-trucks in fatal crashes per truck vehicle-mile-traveled
has decreased. However, with a corresponding increase in the volume of truck travel,
the involvement per unit population has not seen the same declining trend. Thus, there
is continued public concern about large-truck crashes.
Of all the injury and fatal crashes in 2008, 566,554 (34%) were single-vehicle
crashes and 1,097,463 (66%) were multi-vehicle crashes. This indicates that in at least
two-thirds of the injury and fatal crashes, more than one person is involved (even single-
vehicle crashes can have more than one person involved). Among all crashes with at
least one injury, there are, on average, 1.29 persons injured or killed per crash.
These statistics indicate that a large number of crashes involve more than one person
and in many cases from multiple vehicles. Despite these results, a vast majority of the
literature on crash severity has focused only on the highest level of severity rather
than on the injuries sustained by the different persons involved in the crash. Arguably,
one of the major reasons for the state-of-practice approach is that the highest severity
of the crash is more reliably recorded than the severities sustained by individual
persons (Chang and Mannering, 1999).
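
The percentage shares quoted above can be checked directly from the two crash counts. The following is a minimal sketch; the variable and function names are illustrative and are not taken from the source data.

```python
# Counts quoted in the text for injury and fatal crashes in 2008.
single_vehicle = 566_554
multi_vehicle = 1_097_463
total = single_vehicle + multi_vehicle


def share(part, whole):
    """Return part/whole as a percentage rounded to the nearest integer."""
    return round(100 * part / whole)


# Shares of single-vehicle and multi-vehicle crashes, in percent.
print(share(single_vehicle, total), share(multi_vehicle, total))
```

The multi-vehicle share is the basis for the "at least two-thirds" statement, since every multi-vehicle crash involves at least two persons.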
The above statistics clearly underscore the need for studying large-truck crashes
towards improving the safety of the transportation system. The results from such studies
will be valuable in transportation policy, improvement of carrier operation, and incident-
cost reduction. The broad goal of this dissertation is to contribute towards that end.
Specifically, data from a recent, nationally-representative sample of large-truck crashes
will be analyzed to determine the factors affecting the injury severity of these crashes.
1.2 Objectives of the Research
The objective of this study is to develop econometric models of injury-severity in
large-truck crashes. The models developed will facilitate the evaluation of a variety of
countermeasures from the standpoints of transportation control, roadway design, traffic
operations, and carrier management aimed at improving safety. Despite the importance
of truck-safety, research on understanding the relative magnitudes of the influences of
the various factors affecting the injury-severity of such crashes is limited. The models
estimated in this study will include a comprehensive set of explanatory factors including
the characteristics of the crash, vehicle, truck-carrier, and the occupants.
This dissertation will also contribute methodologically to the literature on injury-
severity modeling. Two important enhancements will be incorporated. First, the use of
advanced, flexible structures such as the Ordered Generalized Extreme Value (OGEV)
model will be explored to replace the simpler and restrictive Ordered-Probit models
conventionally used in the injury-severity literature. Second, the effects of common
unobserved factors (error correlations) affecting the injury severity of all persons
involved in the same crash will be incorporated in the models developed in this
dissertation. Most research to date either ignores this effect (even though many crashes
involve more than one person) or focuses on the injury to one particular person (such as
the car driver) involved in the crash. The contributions of this dissertation are discussed
further in Chapter 2.
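
For orientation, the conventional ordered-probit baseline referred to above can be written in its standard textbook form. The notation here (latent severity $y_i^{*}$, coefficients $\boldsymbol{\beta}$, thresholds $\mu_k$, standard normal CDF $\Phi$) is an assumption made for this sketch and is not necessarily the notation used in the later chapters.

```latex
\begin{aligned}
y_i^{*} &= \boldsymbol{\beta}^{\top}\mathbf{x}_i + \varepsilon_i,
  \qquad \varepsilon_i \sim N(0,1), \\
y_i &= k \quad \text{if } \mu_{k-1} < y_i^{*} \le \mu_k,
  \qquad k = 1,\dots,K, \quad \mu_0 = -\infty,\ \mu_K = +\infty, \\
P(y_i = k \mid \mathbf{x}_i) &=
  \Phi\!\left(\mu_k - \boldsymbol{\beta}^{\top}\mathbf{x}_i\right)
  - \Phi\!\left(\mu_{k-1} - \boldsymbol{\beta}^{\top}\mathbf{x}_i\right).
\end{aligned}
```

Because the same $\boldsymbol{\beta}$ enters every threshold comparison, a covariate shifts all cumulative probabilities in the same direction; this is the restriction that more flexible structures aim to relax.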
1.3 Organization of the Document
A brief synthesis of the relevant literature is presented in Chapter 2. A detailed
description of the data and the sample-formation procedure is outlined in Chapter 3.
Chapter 4 presents the results of the crash-level ordered-probit models for injury
severity. The empirical results capturing the effects of several explanatory factors
including driver, vehicle, crash, environment and carrier are discussed. These models
will serve as the basis for further advanced specifications. Chapter 5 discusses how the
highest level of injury sustained may not be a comprehensive descriptor of the overall
severity of the crash and presents a methodology to simultaneously model the severity
sustained by all persons involved in the crash. In Chapter 6, we continue the exploratory
analysis of the injury severity of each occupant with a panel heteroskedastic Ordered
Generalized Extreme Value (OGEV) model, which relaxes the constraints of the ordered-probit
model. A summary of the key contributions of this dissertation and directions for future
research are presented in Chapter 7.
CHAPTER 2 LITERATURE REVIEW
This chapter presents a synthesis of literature relevant to the dissertation's
objective of modeling injury-severity in large-truck crashes. The rest of this chapter is
organized as follows. A review of past research on large-truck crashes is presented in
Section 2.1. A summary of studies on modeling injury-severity in automobile crashes, in
general, is presented in Section 2.2. Significant emphasis is placed on the modeling
methods employed and the key empirical results. Section 2.3 positions the dissertation
in the context of past research by identifying the gaps in knowledge and the
contributions of this study.
2.1 Research on Large-Truck Crashes
A brief synthesis of literature on large-truck (gross vehicle weight rating greater
than 10,000 pounds) crashes, with particular focus on the analysis of injury-severity of
such crashes is presented in this section of the chapter.
Work undertaken by Khattak and colleagues (Duncan et al., 1998; Khattak et al.,
2002) and Chang and Mannering (1999) is most directly related to our efforts.
Duncan et al. (1998) examined the injury severity in the case of rear-end collisions
between heavy trucks and passenger cars. The focus was on modeling the injury to the
passenger-car occupants as they are almost always likely to sustain more severe
injuries than truck drivers in crashes with large/heavy trucks. Ordered-probit models
were developed using the Highway Safety Information System (HSIS) data from North
Carolina for the years 1993-1995. The results indicate that higher speeds (and speed
differentials), darkness, and grade increase the severity of the injury. Females and
drunk drivers were estimated to sustain more severe injuries compared to males and
non-drunk drivers, respectively. Snowy/icy road conditions and traffic congestion were
found to decrease injury severity compared to dry and free-flow traffic conditions,
respectively. Finally, the car being struck in the rear was found to lead to more
severe injuries compared to the truck being struck in the rear.
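
To illustrate how an ordered-probit model turns a factor such as higher speed into a shift towards more severe outcomes, the following is a minimal sketch. The thresholds and index values are hypothetical and are not estimates from Duncan et al. (1998); the function names are likewise illustrative.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def ordered_probit_probs(xb, thresholds):
    """Return P(y = k) for each ordered category.

    xb is the latent index (coefficients times covariates) and
    thresholds is an increasing list of cut points.
    """
    cuts = [-math.inf] + list(thresholds) + [math.inf]
    cdf = [normal_cdf(c - xb) for c in cuts]
    return [cdf[k + 1] - cdf[k] for k in range(len(cuts) - 1)]


# Hypothetical cut points separating four severity levels.
thresholds = [0.0, 1.0, 2.0]

# A positive coefficient on a factor raises the latent index xb.
baseline = ordered_probit_probs(0.0, thresholds)
higher_index = ordered_probit_probs(0.5, thresholds)

print([round(p, 3) for p in baseline])
print([round(p, 3) for p in higher_index])
```

Raising the latent index lowers the probability of the least severe category and raises the probability of the most severe one, which is the sense in which a factor "increases injury severity" in these models.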
Khattak et al. (2002) used the HSIS data from North Carolina for the years 1996-
1998 to examine the injury severity of single large-truck crashes. In particular, the intent
was to examine the differences between rollover and non-rollover crashes. Using
ordered-probit models, the authors found that rollover leads to more severe injuries in
single-truck crashes. Further, dangerous driving behaviors such as drug/alcohol use,
speeding, and not wearing seat belts increase the injury severity. Crashes that result in
fire are also estimated to have a greater injury severity. In this study, the authors
continued to examine the factors affecting the roll-over of trucks in single-truck crashes.
The researchers found that rollovers are more likely to happen at a right, left or U-turn
and on a curved road. Trucks with longer trailers are more likely to roll over. Reckless
driving has the largest influence on increasing rollover propensity. These factors may
also be construed as affecting the injury severity as rollover crashes were
established to be more severe than non-rollover crashes.
Chang and Mannering (1999) modeled the vehicle occupancy and the most-
severe injury sustained by an occupant of the vehicle using data from the state of
Washington. The need to model vehicle occupancy simultaneously with injury severity
was motivated by the observation that the possibility of a severe injury increases with
increasing number of persons in the vehicle. Nested-logit models were developed with
occupancy as the upper-level nest and injury severity as the lower-level nest. Unlike the
previous efforts discussed, Chang and Mannering adopted an unordered discrete-choice
structure to model injury severity. The authors segmented the data into truck-involved
and non-truck-involved crashes and demonstrated the statistical and empirical validity
of such segmentation. For example, the results indicate that higher speeds are strongly
associated with more-severe crashes when trucks are involved (the effect was
insignificant in the case of non-truck crashes). Similarly, the effects of turning
movements (right turn and left turn) of the vehicles on the crash severity were also
found to be different. Consistent with expectations, the results also indicated that multi-
occupant vehicles in truck-involved crashes result in significantly more severe injuries.
Overall, these authors argue that counter-measures aimed at reducing the severity of
truck-involved crashes could be different from those aimed at reducing the severity of
non-truck crashes.
In contrast to the previous three studies which have examined the level of injury
severity, other researchers have focused on fatal crashes involving large-trucks.
Braver et al. (1996) examined the effect of roadway geometry, weather, and other
factors on the incidence of fatal large truck-car crashes. Defiance of traffic control
devices, curves, and slippery roadway conditions were some of the conditions found to
be associated with fatal crashes.
Campbell (1991) examined the impact of driver age on the involvement in fatal
crashes. Based on nationally-representative data for the years 1980-1984, the author
developed estimates for the risk of involvement in fatal crashes as the number of fatal
crashes per hundred million vehicle miles. The analysis indicates that younger drivers
(age < 27 years) are over-involved in fatal crashes. Further, the relative risk of very
young drivers (less than 21 years of age) was found to be about six times the overall
risk for all drivers.
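
The exposure-based risk measure described above, and the relative risk derived from it, can be sketched as follows. The crash counts and mileages are invented for illustration (chosen only so that the ratio echoes the "about six times" finding) and are not Campbell's (1991) data; the function names are hypothetical.

```python
def fatal_involvement_rate(fatal_crashes, vehicle_miles):
    """Fatal crash involvements per hundred million vehicle miles."""
    return fatal_crashes / vehicle_miles * 1e8


def relative_risk(group_rate, overall_rate):
    """Ratio of a group's involvement rate to the overall rate."""
    return group_rate / overall_rate


# Hypothetical inputs, for illustration only.
overall_rate = fatal_involvement_rate(fatal_crashes=400, vehicle_miles=2.0e10)
young_rate = fatal_involvement_rate(fatal_crashes=12, vehicle_miles=1.0e8)

print(round(overall_rate, 2), round(young_rate, 2))
print(round(relative_risk(young_rate, overall_rate), 2))
```

Normalizing by vehicle miles separates over-involvement from sheer exposure: a group that drives few miles can have few crashes in absolute terms and still carry a high rate.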
Golob et al. (1987) examined the severity (both injury severity and incident
duration) of truck-involved freeway accidents. About 9000 crashes from the years 1983-
1984 were obtained from TASAS (Traffic Accident Surveillance and Analysis System)
data base maintained by the California Department of Transportation. All data were
from the Los Angeles area. Based on the number of fatalities per accident, the "hit-
object" type crashes were found to be most dangerous (0.025 fatalities per accident).
"Rear end" and "other type" (other than hit-object, side-swipe, broad-side, and overturn)
of crashes were also very dangerous (0.021 fatalities per accident).
It is useful to mention here that past studies have also examined other aspects of
large-truck safety (other than injury severity). For example, research undertaken by
Blower et al. (1993) examined the factors affecting the crash propensities (or the risk of
being involved in a crash) and showed that the truck crash rate is significantly affected by
truck configuration, location (rural or urban), traffic density, and time of day. Hallmark
(2009) focused on the incidence of a specific type of crash, the lane-departure
crash. Using logistic regression, the study found that such crashes were more
likely to happen when the driver is fatigued, upset, distracted, or unfamiliar with the
roadway. More generally, driver fatigue has been recognized as an important factor
affecting truck crashes. Based on a survey conducted in New Zealand, Gander et al.
(2006) found that 7.6% of crashes were fatigue-related. The duration of the
most recent sleep period was considered as a measurement of fatigue. In consideration
of the effect of driver fatigue, the hours of service (HOS) of commercial drivers are
regulated by the Federal Motor Carrier Safety Administration (FMCSA) in the United
States. Commercial motor vehicle (CMV) drivers are limited to 11 cumulative hours of
driving in a 14-hour period, which must then be followed by a rest period of no less than
10 consecutive hours. Drivers employed by carriers in "daily operation" may not drive
more than 70 hours within any period of 8 consecutive days (NHTSA, 2008). Although
the primary focus of this dissertation is on injury severity (conditional on a
crash) and not on the risk of a crash happening, insights from the studies discussed above
are useful, and appropriate explanatory variables (such as fatigue) will be included in our
models.
Overall, the literature on the modeling of injury-severity of large-truck crashes
appears to be limited. Past studies have focused on specific types of crashes (such as
rollover or rear-end) or on specific injury-severity levels (such as fatal crashes). Also,
the methods employed are rather simplistic. In this context, the intent of this dissertation
is to present a comprehensive analysis of the injury severity of all types of crashes
involving large trucks. Flexible econometric structures and a comprehensive empirical
specification will be developed.
2.2 Research on Modeling Injury-Severity of Automobile Crashes
Although few studies have analyzed injury severity in large-truck crashes, the
body of literature on modeling injury severity in general is extensive. A substantial
fraction of these studies focus on automobile crashes, and these are discussed in the rest
of this section. It is envisioned that methodological and empirical insights from these
past studies will inform our research on large-truck crashes. It is also useful to
acknowledge that the injury severity of motorcycle, pedestrian, and bicycle crashes has
also been studied in the past. To limit the scope of our literature review, we do not
present an extensive discussion of these studies. However, advanced methods used to
model such crashes are discussed wherever appropriate.
Table 2-1 summarizes the key features of several studies in the literature on
modeling injury severity of automobile crashes. Five important features are discussed in
separate sub-sections: (1) Injury of Interest, (2) Levels of Injury Severity, (3) Data
Sources, (4) Modeling Method, and (5) Explanatory Factors. The first three studies listed
in the table are the ones that explicitly focus on large-truck crashes (also discussed in
Section 2.1).
2.2.1 Injury of Interest
Automobile crashes could potentially involve one or more vehicles, and each
vehicle could have one or more occupants (including the driver). All
these occupants could have different levels of injury severity. Correspondingly, there
are differences in the injury severity of interest across the studies presented in
Table 2-1.
In the simplest case, some studies have defined the overall severity of the crash
as the most severe injury sustained by any person involved in the crash. Alternatively,
others define the injury severity of each vehicle as the most severe injury sustained by
any person in that vehicle (Chang and Mannering, 1999; Chang and Wang, 2006;
Milton et al., 2008). Some studies have focused specifically on the injury sustained by
the driver of the vehicle (Kockelman and Kweon, 2002; Eluru and Bhat, 2007;
Yamamoto et al., 2008; Wang and Abdel-Aty, 2008; Delen et al., 2006; Xie et al.,
2009). Others have focused on specific occupants such as the driver and front-seat
occupant (Newgard, 2008), or front-seat and rear-seat passengers (Shimamura et al.,
2005). At the other end of the spectrum are studies that have examined the injury
severity of each of the occupants involved in the crash (O'Donnell and Connor, 1996;
Kuhnert et al., 2000; Khattak et al., 2003; Chang and Wang, 2006; Eluru et al.,
2009).
The focus on the most severe injury is appropriate from the standpoint of data, as
the injury sustained by every person involved in the crash is often not accurately
recorded (the most severe injury is generally well recorded; Chang and Mannering,
1999). At the same time, models of the injury sustained by every occupant involved in
the crash (subject to data availability) present a comprehensive description of the
overall severity of crashes. In light of the above discussions, this dissertation will
develop models at both the crash-level (most severe injury) and the occupant level. The
data available support such an effort and are described in detail in the next chapter.
2.2.2 Levels of Injury Severity
Injury severity is recorded on an ordinal scale. The number of categories used in
modeling ranges from two (high or low; Ouyang et al., 2002) to seven (no injury, minor,
moderate, serious, severe, critical, and non-survivable; Newgard, 2008). Four- and
five-category scales are more common. The "KABCO" scale is the most commonly used (for
example, Duncan et al., 1998), with "K" being the most severe category representing a
fatal crash, "A" representing incapacitating injury, "B" representing non-incapacitating
injury, "C" being minor injury, and "O" representing the least severe category (no
injury/property damage only). Most state and national crash databases use this scale.
Consistently, four- and five-level ordinal scales are most commonly used in the models
for injury severity.
The dataset used in this research provides two measures of injury severity. The
first measure is based purely on police records and the second is derived from additional
hospital data and interviews (further details provided in the next chapter). The first
measure uses a four-level scale whereas the second uses a three-level scale. Models
will be developed using each of these measures.
2.2.3 Data Sources
Most research in injury-severity modeling has used data from national- and
state-level sources. For instance, the Crashworthiness Data System (CDS) from the National
Automotive Sampling System (NASS) was used by Wang and Kockelman (2005) and
Newgard (2008). The General Estimates System (GES), also from NASS,
was used by Delen et al. (2006) and Eluru et al. (2009).
Chang and Mannering (1999) and Yamamoto et al. (2008) used the state-level
Washington State Highway Accident Records Database. Non-US data such as the
Linked Accident Database from Japan (Shimamura et al., 2005) and a French accident
database (Lapparent, 2008) have also been used.
The data to be used in this study come from the Large Truck Crash Causation
Study (LTCCS; discussed in more detail in Chapter 3). The database assembled by
this study augments the conventional crash data obtained from police reports in several
ways. For instance, additional data related to "human factors" such as the fatigue,
illness, and distraction of the drivers were collected. Historical records on the safety of
the drivers, vehicles, and carriers (past violations and citations) involved in the crashes
were also obtained and added to the crash data obtained from the police accident
reports. Thus, the database available for this study enables the development of a
richer empirical specification.
2.2.4 Modeling Method
Almost all past research on injury-severity modeling has employed
statistical/econometric methods. Exceptions include the Classification and Regression
Tree (CART) method used by Kuhnert et al. (2000) and Chang and Wang (2006) and
Artificial Neural Networks used by Delen et al. (2006). While the CART and Neural
Network methods can help establish very flexible and non-linear relationships, their
value as descriptive models is limited because it can be extremely difficult to interpret
the marginal effects of various factors as implied by the estimated relationships. The
ability to attach physical meaning to the parameters is important from the standpoint of
identifying appropriate countermeasures to reduce crash severity. Further, such
methods, unlike the statistical approaches, often do not provide measures of the
statistical significance of the estimated relationships (i.e., p-values). Finally, methods
like CART can be difficult to apply
with an increasing number of explanatory variables. This dissertation focuses on the use of
econometric methods. In the rest of this section, such methods used in past research
for injury-severity modeling are discussed in detail.
There are two fundamental issues in the modeling of injury-severity of crashes: (1)
the treatment of the ordinal, injury-severity variable in the modeling process and (2) the
incorporation of interdependencies (error correlations) among the injury-severity of the
different persons involved in the same crash.
2.2.4.1 Treatment of ordinal
As discussed in Section 2.2.2, injury severity is generally recorded on an ordinal
scale. Most commonly, the following five-level scale, in increasing order of injury
severity, is used: Property Damage Only, Possible Injury, Non-Incapacitating Injury,
Incapacitating Injury, and Killed (Fatal). Thus, ordered-response discrete-choice models
(ordered probit or ordered logit) are appropriate for the analysis of such data. In fact,
this has been a popular approach for modeling injury severity in general (for example,
Kockelman and Kweon, 2001; O'Donnell and Connor, 1996).
In this approach, the observed, ordinal, injury-severity level is related to an
unobserved (latent), continuous injury propensity, which is then related to a vector of
explanatory variables corresponding to the crash via a linear-in-parameters
specification.
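The structure described above can be written compactly as follows. The notation is an assumption introduced here for illustration (it does not appear in the original text): $y_i^{*}$ is the latent injury propensity for observation $i$, $x_i$ the vector of explanatory variables, $\beta$ the coefficient vector, $\mu_j$ the thresholds separating the $J$ ordered levels, and $F$ the cumulative distribution function of the error term (standard normal for ordered probit, standard logistic for ordered logit).

```latex
y_i^{*} = \beta^{\top} x_i + \varepsilon_i, \qquad
y_i = j \;\; \text{if} \;\; \mu_{j-1} < y_i^{*} \le \mu_j, \quad j = 1, \dots, J,

\mu_0 = -\infty, \qquad \mu_J = +\infty,

P(y_i = j \mid x_i) = F(\mu_j - \beta^{\top} x_i) - F(\mu_{j-1} - \beta^{\top} x_i).
```

Only one coefficient vector and $J - 1$ finite thresholds are estimated, regardless of the number of severity levels.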
Wang and Kockelman (2005) and O'Donnell and Connor (1996) employed the
heteroscedastic variant of the ordered-response model (the conventional ordered-response
model assumes homoscedasticity, or equal variances). Thus, their
specifications allowed the variance of the error term to vary systematically as a function
of certain exogenous factors such as speed, vehicle type, vehicle weight, time of day,
and occupant characteristics.
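One common way to write such a heteroscedastic ordered-response model is sketched below. The exponential form of the scale and all symbols are assumptions made here for illustration, not necessarily the exact specification of the cited studies: $x_i$ enters the propensity with coefficients $\beta$, $z_i$ is the vector of factors affecting the error scale with coefficients $\gamma$, $\mu_j$ are the thresholds, and $F$ is the standardized error distribution.

```latex
\sigma_i = \exp(\gamma^{\top} z_i), \qquad
P(y_i = j \mid x_i, z_i) =
F\!\left(\frac{\mu_j - \beta^{\top} x_i}{\sigma_i}\right)
- F\!\left(\frac{\mu_{j-1} - \beta^{\top} x_i}{\sigma_i}\right).
```

Setting $\gamma = 0$ gives $\sigma_i = 1$ for every observation, which recovers the conventional homoscedastic model.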
In general, the maximum-likelihood procedure is used for model estimation.
However, Xie et al. (2009) adopted the Bayesian inference procedure. The authors
compared the Bayesian and the non-Bayesian methods and concluded that the results
are similar with large samples.
There are several advantages to using the ordered-response models. Such
models explicitly recognize the ordering in the levels of injury severity. The specification
is parsimonious (fewer parameters as there is only one propensity function to estimate).
Finally, the interpretations are straightforward. Generally, a positive coefficient on a
variable implies that the corresponding explanatory factor is associated with more
severe crashes and a negative coefficient implies that the corresponding variable is
associated with less severe crashes.
A key shortcoming of ordered-response models is that they are restrictive in
capturing the effects of explanatory variables on the different levels of injury severity.
Specifically, if a factor is estimated to increase the probability of the most severe injury
(i.e., fatal injury), then the specification implies that the same factor necessarily
decreases the probability of the least severe injury (often the "no-injury"
category). However, this may not always be true. For instance, it has been shown that
airbags decrease the likelihood of fatal injuries in the event of a crash. At the same time,
the deployment of airbags can also cause minor injuries and hence decrease the
likelihood of the least severe (no-injury) outcome. Simple ordered-response models cannot
capture this effect. Researchers have attempted alternative approaches to address this issue, and
these are discussed in the rest of this section.
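The restriction can be seen numerically in a minimal ordered-probit sketch. The thresholds, base propensity, and coefficient below are hypothetical values chosen only for illustration, and the function names are not from the original text.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def ordered_probit_probs(propensity, thresholds):
    """Return the probability of each ordered severity level.

    propensity: systematic part of the latent injury propensity (beta'x).
    thresholds: strictly increasing cut points; J - 1 values give J levels.
    """
    cuts = [-math.inf] + list(thresholds) + [math.inf]
    cdf = [normal_cdf(c - propensity) for c in cuts]
    return [cdf[j + 1] - cdf[j] for j in range(len(cdf) - 1)]


# Hypothetical values, for illustration only.
THRESHOLDS = [0.0, 1.0, 2.0]   # four levels: none, minor, severe, fatal
BASE_PROPENSITY = 0.2          # beta'x without the factor
FACTOR_COEFFICIENT = 0.5       # positive coefficient on a binary factor

without_factor = ordered_probit_probs(BASE_PROPENSITY, THRESHOLDS)
with_factor = ordered_probit_probs(
    BASE_PROPENSITY + FACTOR_COEFFICIENT, THRESHOLDS
)

# A positive coefficient shifts the single propensity upward, so the
# probability of the highest level rises while that of the lowest falls.
print([round(p, 3) for p in without_factor])
print([round(p, 3) for p in with_factor])
```

Because a single propensity shifts against fixed thresholds, no choice of coefficient can lower the probabilities of both extreme levels at once, which is exactly what the airbag example would require.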
The ordered-response models can be directly extended to address this issue by
allowing for variable threshold parameters. Eluru et al. (2008) applied such a
"mixed generalized ordered logit model" to study the injury severity of pedestrians and
bicyclists. Another extension of the conventional ordered-response structure is the
Partial Proportional Odds Model (Wang and Abdel-Aty, 2008). This approach
allows the coefficients on the explanatory variables to differ across the injury
levels. However, relative to the simpler ordered-response models, the interpretation of
the parameters from these advanced models is not straightforward.
Another approach to address the above issue, while still having easily
interpretable structures, is to use unordered specifications such as the nested-logit
model (for instance, Savolainen and Mannering, 2007, and Abdel-Aty et al., 2003).
Such models require the specification of a utility function corresponding to each
alternative (unlike the ordered-probit models, which use a single propensity function and
fixed thresholds) and hence overcome the restrictive empirical specification.
However, the use of the nested-logit structure implies that the ordering of the injury-severity
levels is ignored. The nested-logit models require that each alternative belong to only
one nest, and hence the error correlations between adjacent levels of injury severity
cannot all be effectively captured. A particular extension of the nested-logit model that is
appropriate in the context of ordered alternatives is the Ordered Generalized Extreme
Value (OGEV) model, but this structure has not been applied to injury-severity modeling.
The most flexible error correlations across the choice alternatives can be captured by
the use of mixed-logit models (Train, 2009). With the exception of Milton et al. (2008),
these methods have not been applied to injury-severity analysis. It also appears that the
primary motivation for Milton et al. to use the mixed-logit model was to capture
heterogeneous impacts of explanatory variables on injury severity by using random
coefficients rather than to capture flexible error correlations.
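The added flexibility of alternative-specific utility functions can be sketched with a plain multinomial logit, used here as a simpler stand-in for the nested-logit and mixed-logit structures discussed above. All coefficients are hypothetical and chosen only to mimic the airbag example.

```python
import math


def logit_probs(utilities):
    """Multinomial logit probabilities from one utility per alternative."""
    top = max(utilities)  # subtract the maximum for numerical stability
    weights = [math.exp(u - top) for u in utilities]
    total = sum(weights)
    return [w / total for w in weights]


LEVELS = ["no injury", "minor", "severe", "fatal"]

# Hypothetical coefficients, for illustration only ("no injury" is the base).
CONSTANTS = [0.0, -0.5, -1.5, -3.0]
AIRBAG_EFFECTS = [0.0, 0.8, -0.2, -1.0]   # one coefficient per alternative


def severity_probs(airbag_deployed):
    """Severity-level probabilities with or without airbag deployment."""
    x = 1.0 if airbag_deployed else 0.0
    utilities = [c + b * x for c, b in zip(CONSTANTS, AIRBAG_EFFECTS)]
    return logit_probs(utilities)


no_airbag = severity_probs(False)
airbag = severity_probs(True)
for level, p0, p1 in zip(LEVELS, no_airbag, airbag):
    print(f"{level:10s} {p0:.3f} {p1:.3f}")
```

With one coefficient per alternative, the same factor can lower the probabilities of both the fatal and the no-injury outcomes while raising that of minor injury, at the cost of ignoring the ordering of the levels.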
A third approach to capturing the ordinality among the alternatives is to model the
ordered choice as a sequence of binary choices, where each binary choice involves
choosing a specific level relative to higher (or lower) levels. Dissanayake and Lu (2002)
used such a sequential binary approach. Two sequential model structures were
estimated: one in which the injury severity varied from the lowest level to the highest and
a second in which the severity varied from the highest to the lowest. However,
these researchers assumed that the binary choices were independent. Yamamoto et al.
(2008) pointed out that the error terms are correlated, especially
between two successive levels, and developed an improved structure that
accommodates correlations (partially) among the successive levels.
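A minimal sketch of the sequential binary structure with independent binary logit choices, running from the lowest level to the highest, is given below. The function names and the utilities passed in are hypothetical; this is an illustration of the general idea rather than the cited authors' specification.

```python
import math


def logistic(v):
    """Binary logit probability for systematic utility v."""
    return 1.0 / (1.0 + math.exp(-v))


def sequential_binary_probs(stop_utilities):
    """Level probabilities from a chain of independent binary choices.

    stop_utilities[k] is the systematic utility of the injury stopping at
    level k, given that it is at least level k. With J - 1 utilities the
    function returns probabilities for J ordered levels.
    """
    probs = []
    reach = 1.0  # probability of reaching the current level
    for v in stop_utilities:
        p_stop = logistic(v)
        probs.append(reach * p_stop)
        reach *= 1.0 - p_stop
    probs.append(reach)  # the highest level takes the remaining mass
    return probs


# Hypothetical utilities, for illustration only.
probs = sequential_binary_probs([0.0, 0.0, 0.0])
print(probs)
```

Each stage can have its own coefficients, so a factor may act differently at different severity levels; treating the stages as independent is the assumption that the correlated extension relaxes.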
2.2.4.2 Incorporating interdependencies among the injuries of all persons involved in the crash
Most of the injury-severity models can be classified as "single-equation" models,
and these do not consider the correlations among the injuries sustained by all persons
in the same crash or in the same vehicle. Few studies have used bivariate ordered-response
structures to account for correlations between two persons involved in the same
crash. For example, Hutchinson (1986) analyzed the severity of injuries sustained by
the driver and front-seat passenger simultaneously. Yamamoto and Shankar (2004)
applied such a model to the driver and the most severely injured person in the vehicle. Ouyang
et al. (2002) studied rear-end crashes involving trucks and cars. The injury levels (on a
binary scale) associated with both vehicles were estimated simultaneously.
Most recently, Eluru et al. (2009) used a copula-based approach to accommodate
the dependence in injury-severity propensities among the multiple occupants of the
same vehicle (the dependencies among the different vehicles in the same crash were,
however, not considered). Copulas are functions that generate stochastic-dependence
relationships (i.e., a multivariate distribution) among random variables with given
marginal distributions.
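The idea of a copula can be sketched for two ordinal injury outcomes as follows. The Farlie-Gumbel-Morgenstern (FGM) copula is used here only because of its simple closed form; it is an assumption for illustration and is not claimed to be the copula used in the cited study. The marginal probabilities and the dependence parameter are hypothetical.

```python
def fgm_copula(u, v, theta):
    """Farlie-Gumbel-Morgenstern copula C(u, v), valid for -1 <= theta <= 1."""
    if not -1.0 <= theta <= 1.0:
        raise ValueError("theta must lie in [-1, 1]")
    return u * v * (1.0 + theta * (1.0 - u) * (1.0 - v))


def cumulative(marginal):
    """Cumulative probabilities [0, F_1, ..., 1] for an ordinal marginal."""
    cum = [0.0]
    for p in marginal[:-1]:
        cum.append(cum[-1] + p)
    cum.append(1.0)
    return cum


def joint_ordinal_probs(marginal_a, marginal_b, theta):
    """Joint probability table of two ordinal outcomes with fixed marginals."""
    fa, fb = cumulative(marginal_a), cumulative(marginal_b)
    return [
        [
            fgm_copula(fa[i + 1], fb[j + 1], theta)
            - fgm_copula(fa[i], fb[j + 1], theta)
            - fgm_copula(fa[i + 1], fb[j], theta)
            + fgm_copula(fa[i], fb[j], theta)
            for j in range(len(marginal_b))
        ]
        for i in range(len(marginal_a))
    ]


# Hypothetical marginals over three levels (low, medium, high severity).
DRIVER = [0.5, 0.3, 0.2]
PASSENGER = [0.6, 0.3, 0.1]
table = joint_ordinal_probs(DRIVER, PASSENGER, theta=0.8)
for row in table:
    print([round(p, 4) for p in row])
```

The row and column sums of the table reproduce the two marginals, while a positive dependence parameter places more probability on both occupants sustaining the same extreme level than independence would.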
While mixed models and error-component structures have been routinely used in
other fields (such as economics and travel-demand modeling) to estimate correlated
models, it appears that such methods have had limited application in the context of
injury-severity analysis. This dissertation will contribute to the literature by estimating
such advanced econometric models for injury-severity.
2.2.5 Explanatory Factors
The last major column in Table 2-1 identifies the primary explanatory factors used
in past injury-severity models. These factors are classified into the following six
categories: (1) crash characteristics, (2) driver/occupant characteristics, (3) vehicle
characteristics, (4) environment characteristics, (5) roadway characteristics, and (6)
occupant protection characteristics. Each of these is discussed in the rest of this
section.
2.2.5.1 Crash characteristics
The characteristics of the crash that were estimated to influence injury severity
include time of day (such as peak or off-peak and daylight or dark), crash type (frontal,
rear-end, rollover, or other), and other crash descriptors (number of vehicles involved,
at-fault driver, harmful event/causation).
NHTSA (2008) reports that the period from midnight to 3 a.m. on Saturdays and
Sundays was the deadliest of all 3-hour periods, with more fatal crashes than any other
time period. Kockelman and Kweon (2002) also find that crashes on
Friday, Saturday, and Sunday during late night (midnight to 4 a.m.) are more severe and
that late night on Sundays was the most dangerous time. Analysis results from Chang and
Mannering (1999) suggest that nighttime was more dangerous and rush-hour
crashes were less severe. Eluru and Bhat (2007) report that crashes occurring between
6 a.m. and 7 p.m. were less severe than crashes during other time periods. Eluru et al. (2009)
also report that 12 a.m. to 6 a.m. was the most dangerous time. Friday afternoons were
also shown to have more crashes but fewer fatal crashes in NHTSA (2008). It is
important to note that the time-of-day of the crash could be reflective of traffic volumes,
speeds, and lighting conditions. All the above results indicate that crashes during darker
and less-congested time periods are more severe than crashes during brighter and
more-congested times.
Chang and Mannering (1999) found that summer was the most dangerous season
followed by spring and autumn. Many other studies did not find statistically significant
effects of the season, possibly because these effects are captured by variables
describing the road-surface condition (for instance wet/icy/snowy conditions could be
reflective of winter conditions). It is useful to note that Chang and Mannering did not
control for road-surface conditions.
Crash type is another important factor affecting injury severity. Head-on crashes
and crashes with a stationary object were most dangerous, and occupants of the vehicle being
struck sustained more severe injuries relative to those of the striking vehicle (Eluru et al., 2007).
Kockelman and Kweon (2002) as well as Duncan et al. (1998) report that rollover
crashes can result in more severe injuries. Other types of crashes such as angle
and sideswipe crashes have not been extensively examined. Crashes that lead to fire are found
to result in more severe injuries, as would be expected. As hazardous cargo can lead to
fire in the event of a crash, countermeasures aimed at improving the safety of trucks
carrying such cargo become more important.
The number of vehicles involved in the crash was also important from the
standpoint of injury severity. NHTSA (2008) estimates that multi-vehicle crashes were
more dangerous using aggregate data and this result was also supported by
econometric models estimated by Chang and Mannering (1999). Yamamoto and
Shankar (2004) also report that more passengers in the vehicle increase the severity of
the most severe injury in the accident. At the same time, these researchers also find
that an increasing number of passengers in the vehicle decreases the injury severity of the
driver. Such results suggest that focusing on the most severe injury or the injury
sustained by one of the occupants is not adequate. Rather, the injuries sustained by all
persons involved in the crash must be studied to have a comprehensive understanding
of the crash severity.
Vehicle movement at the time of crash is also a factor influencing injury severity.
Crashes while negotiating curves and passing other vehicles were shown to have
higher probabilities of fatalities compared with other kinds of movements such as
turning and merging (NHTSA, 2008). Among turning movements, left turns might be
particularly critical because of the possibility of conflicts with opposing streams of traffic
(see, for instance, the study on left-turn crashes at intersections by Wang and Abdel-Aty,
2008). The impact point has also been found to determine injury severity. Based on
research by Delen et al. (2006), occupants of the vehicle that is struck are more likely to
sustain severe injuries compared to the occupants of the striking vehicle.
2.2.5.2 Vehicle characteristics
The age, size, engine, and other characteristics of the vehicle(s) involved in the
crash affect the injury severity.
The research by Kockelman and Kweon (2002) indicates that drivers of light- and
heavy-duty trucks and minivans are better protected against injuries. Yamamoto and
Shankar (2004) report that drivers of large trucks sustain less severe injuries. Khattak
and Rocha (2003) studied sport utility vehicles (SUVs) and found that SUVs were more
likely to roll over, but that their protective effect exceeded the harmful effect of rollover.
Therefore, compared with passenger-car occupants, SUV occupants have less severe injuries
in the event of a crash. Eluru and Bhat (2007) showed that drivers of sedans were more
likely to be severely injured than others in two-vehicle crashes. According to
Kockelman and Kweon (2002), in two-vehicle crashes, heavy-duty trucks result in more
severe injury for the driver of the partner vehicle. Consistent results were obtained in
other research focusing on the effect of vehicle type. In general, it appears that occupants
of heavier vehicles often have less severe injuries. At the same time, if heavier vehicles
are involved in the crash, the overall severity of the crash could be higher because of
greater injuries to the occupants of the other vehicle(s) involved in the crash.
Some other factors such as vehicle age have also been studied. For example,
Yannis et al. (2005) report that older vehicles (age > 35 years) are involved in more
severe crashes. Khattak et al. (2003) report higher injury-severity to be associated with
large trucks manufactured before 1992.
2.2.5.3 Driver and occupant characteristics
As already discussed, many studies have focused on the injury sustained by the
drivers of vehicles. NHTSA (2008) estimates that drivers comprised 63% of the total
persons injured or killed in crashes and passengers accounted for only 28%. Therefore,
the substantive focus on drivers seems appropriate.
The age of the drivers and the vehicle occupants has been found to be strongly
related to injury severity. Wang and Abdel-Aty (2008) reported that very young
(age ≤ 19) and young drivers (19 < age ≤ 24) were more likely to sustain severe injuries.
Based on fatality and injury rates per 100,000 population (NHTSA, 2008), men aged
21-24 and women aged 16-20 had the highest fatality rates, while both males and
females aged 16-20 had the highest injury rates. Eluru et al. (2009) report that younger
drivers (16-20) are less likely to be severely injured compared with drivers over 65 years
of age. Overall, there appears to be a non-linear effect of age on injury severity, with the
youngest and the oldest being susceptible to more severe injuries (arguably for very
different reasons) compared to the middle-aged.
Eluru et al. (2009) also examined the effect of driver age on the injuries sustained
by other occupants in the vehicle. Drivers over 45 years old were estimated to be
associated with more severe injury to front-seat passengers, but the driver's age did not
significantly affect rear-seat passengers. Children less than 5 years old were less likely to
be severely injured when seated in the rear, and passengers over 65 were more likely to be
severely injured.
Gender is also expected to influence injury severity. Based on aggregate,
national-level crash data from 2007 (NHTSA, 2008), men had a significantly higher rate of
being killed or severely injured compared to women. However, econometric models
developed by Eluru and Bhat (2007) and Kockelman and Kweon (2002) suggest that
men suffer less severe injuries compared to women, after controlling for several other
factors that affect crash severity.
Alcohol is another important factor studied. There were 12,998 alcohol-impaired
driving fatalities in 2007, which accounted for 32% of all traffic fatalities for the year.
Among the fatal crashes occurring from midnight to 3 a.m., 65% involved alcohol-impaired
driving. Studies on alcohol (Duncan et al., 1998; Eluru and Bhat, 2007)
confirmed that drivers under the influence of alcohol were likely to be more severely injured
than others, and Eluru et al. (2009) estimated that the alcohol consumption of the
drivers also affected the injury severity of the passengers. It is useful to note here that
alcohol records for passengers are not required in Police Accident Reports (PAR).
Driver fatigue is another critical aspect that affects injury severity. Gander et al.
(2006) reported that 41%-71% of truck crashes were related to fatigue. Srinivasan
(2003) estimated that fatigued drivers were five times as likely to be seriously injured and
faced a 30% lower chance of experiencing a property-damage-only (PDO) event.
Sleepy drivers were more likely to be involved in more severe crashes (Khattak et al.,
2003). Fatigue and sleeping habits can be expected to be even more important
attributes of truck drivers compared to car drivers.
Speeding was a major cause of crashes and contributed to 31% of all fatal
crashes in 2007 (NHTSA, 2008). Among drivers involved in fatal crashes, young males
were more often involved in speeding, and speeding combined with alcohol was clearly
a deadly combination.
Data on drivers' history of traffic violations have also been studied. Drivers
with a violation history faced about 15% and 22% increases in the probability of moderate
and severe injuries, respectively (Srinivasan, 2003).
For all the passengers in a vehicle, the seating position is also important.
O'Donnell and Connor (1996) found that the left-rear seat is the most
dangerous seating position. Newgard (2008) concluded that seat position is strongly
correlated with the passenger's age.
2.2.5.4 Environmental characteristics
Environmental factors affecting injury severity include weather and light conditions.
In general, bad weather has been found to be associated with lower injury severity
(Duncan et al., 1998; Yamamoto and Shankar, 2004). This is possibly because
drivers are inherently more cautious and do not speed during bad weather (Eluru and
Bhat, 2007). Weather conditions are also reflective of the road surface conditions (for
example, rainy weather also leads to a slippery road surface). Snow intensity and
wind gust speed were studied by Khattak and Knapp (2001). The results indicated that
higher wind gusts during snow events tended to result in more severe injuries. The
negative effect of snowfall intensity on injury level was attributed to the greater snow
accumulation under higher snowfall intensity acting as an attenuator.
The effects of light conditions are more complex than those of weather. Wang and Kockelman
(2005) concluded that lack of light decreased the severity in one-vehicle crashes but
increased the injury severity in two-vehicle crashes. Eluru and Bhat (2007) found drivers
to be less severely injured at dusk or in the dark with lighting, for the same reason as in
bad weather. Huang et al. (2008) found that crashes at night were more serious
than those during daytime and that poor street lighting could increase the odds of
a severe crash by about 69%.
Given the strong interdependencies between weather, lighting conditions, and
driving behavior (drivers are inherently more cautious during bad weather and bad
lighting conditions and perhaps the converse is true under good weather and good
lighting conditions), additional research on effectively disentangling the marginal effects
of these correlated factors is needed.
2.2.5.5 Roadway characteristics
Roadway characteristics include roadway design, location, surface condition,
traffic control and traffic volume.
Because speed is an important determinant of severe injury and the actual speed is
seldom recorded, the speed limit is often used as an important proxy variable in
injury-severity studies. As is commonly accepted, roads with higher speed limits are
associated with higher injury severity. The medium-to-high speed limit (26-64 mph) was
most dangerous and
the high speed limit (>65mph) cause more severe injury than low speed limit roads
(45 mph) was
considered to be dangerous to car drivers at intersection and to both car and truck
drivers at straight section.
Speed limit is also indicative of the location of the crash, such as an interstate
highway or a state road. High speeds occur on roads in good condition, mainly highways,
and lower speeds may occur on rural roads and highway ramps. Chang and
Mannering (1999) estimated that rural areas were more dangerous than urban areas.
Furthermore, road classification, curves, and uphill or downhill grades also influenced
injury severity in combination with the speed limit. Wang and Kockelman (2005)
found that crashes on curves were more severe, while uphill grades increased the injury level
in one-vehicle crashes but decreased it in two-vehicle crashes. Road signs such as
flashing lights, electronic displays of traffic conditions, and camera warning signs also
influenced traveling speed. The effectiveness of these speed-control methods has been
studied (Alicandri and Warren, 2003; Sarasua et al., 2006).
Another crucial factor in injury severity is the traffic condition, which usually varies
over time. Traffic varies notably between peak and off-peak hours of the day. Nighttime
has less traffic and higher speeds. Without considering special occasions, such as
congestion caused by an incident, a sporting event, or a work zone, traffic conditions are
reflected by the time of
the crash. Special traffic conditions that disrupt traffic also affect the probability of
crashes and injuries, but these records are not available in most police-reported data.
Higher truck percentages on the roadway were shown to decrease the injury
severity of accidents (Milton et al., 2008), and this was explained by the slowing effect on
travel speeds of the sheer number of trucks.
2.2.5.6 Occupant protection
Occupant protection mainly refers to seat belts and airbags.
NHTSA (2008) estimated that 15,147 lives were saved in 2007 by the use of seat
belts. Seat belt use was studied by Eluru and Bhat (2007), which indicated that men,
younger individuals (age
which an airbag was not deployed. This was because airbags deploy when the vehicle
is struck violently and the occupant is subjected to a severe impact.
2.3 Contribution of this Dissertation
The discussions thus far clearly highlight the value of improving truck-safety and
studies conducted to date on identifying appropriate countermeasures. In light of the
above discussions, the following are the main empirical and methodological
contributions of this dissertation.
Empirical Contributions:
The focus of this research is on large-truck crashes. Despite the unique aspects of
truck-crashes and the importance of safe trucking to the economy of the country,
research on the factors affecting injury-severity has been limited. This study contributes
by using a recently-assembled national-level database to develop econometric models
to relate injury-severity to a wide variety of explanatory factors. A rich empirical
specification will be developed that incorporates the marginal effects of several
explanatory factors including crash characteristics, driver characteristics, vehicle
characteristics, environment and roadway characteristics.
Methodological Contributions:
Almost all past research in injury-severity modeling has used relatively simple and
restrictive methods for analysis. For example, a key shortcoming of past research is
that the interdependencies among the injuries sustained by the different persons
involved in the same crash have largely been ignored. These will be explicitly
incorporated in our models, leading to more comprehensive descriptions of the
overall severity of multi-person, multi-vehicle crashes. A second key shortcoming is that
the popularly used ordered-response models are restrictive in capturing the effects of
explanatory variables on the different levels of injury severity. Specifically, if a factor is
estimated to increase the probability of the most-severe injury (i.e., fatal injury), then the
specification implies that the same factor necessarily decreases the probability of the
least-severe injury (often this is the "no-injury" category). This research will use
advanced methods such as the OGEV models to develop flexible empirical structures.
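The restriction of the ordered-response structure can be illustrated with a small numerical sketch. The logistic link, the cut points, and the index values below are assumptions chosen only for illustration; they are not estimates from this study.

```python
import math

def ordered_logit_probs(index, thresholds):
    """Return P(y = j) for j = 0..J-1 under an ordered logit.

    index      -- the linear predictor beta'x (hypothetical value)
    thresholds -- increasing cut points tau_1 < ... < tau_{J-1}
    """
    def cdf(z):
        return 1.0 / (1.0 + math.exp(-z))

    # Cumulative probabilities P(y <= j); the last one is 1 by definition.
    cum = [cdf(t - index) for t in thresholds] + [1.0]
    return [cum[0]] + [cum[j] - cum[j - 1] for j in range(1, len(cum))]

# Hypothetical cut points for a five-level severity scale.
TAUS = [-1.0, 0.0, 1.0, 2.0]

low = ordered_logit_probs(0.0, TAUS)   # before the factor changes
high = ordered_logit_probs(0.5, TAUS)  # after the factor raises beta'x

print("P(no injury):", round(low[0], 3), "->", round(high[0], 3))
print("P(fatal):    ", round(low[-1], 3), "->", round(high[-1], 3))
```

Because a single index shifts all cut points at once, any increase in the index raises the probability of the highest level and lowers the probability of the lowest level; the two extremes cannot move in the same direction.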
CHAPTER 3 DATA
In this chapter, we first present an overview of the source and characteristics of
the LTCCS data. Then, the detailed procedures for data cleaning and variable
reduction are described, and a description of the sample data is presented at the end of
the chapter.
3.1 Data Source and Raw LTCCS Data Characteristics
This study uses data from the Large Truck Crash Causation Study (LTCCS).
These data represent a sample of large-truck crashes that occurred between April 2001
and December 2003. Data on approximately a thousand crashes were collected from 24
sites in 17 states. Each crash in the LTCCS sample involves at least one large truck
and resulted in at least one injury. These data were collected by the Federal Motor
Carrier Safety Administration (FMCSA) and the National Highway Traffic Safety
Administration (NHTSA) of the U.S. Department of Transportation (DOT).
USDOT/FMCSA (2006) and Hedlund and Blower (2006) provide an overview of the
data. The full data and related documentation are available for download from the
LTCCS website at: http://ai.fmcsa.dot.gov/ltccs/default.asp.
The raw database has over a thousand data fields and is organized into 58
different files. The data sample includes 1,070 cases with 2,284 vehicles (including
1,141 large trucks and 1,043 passenger vehicles) and 3,014 occupants (including
drivers and passengers). The crashes resulted in 251 fatalities and 1,499 injuries. Injury
severity is categorized into no injury, possible injury, non-incapacitating injury,
incapacitating injury, and killed.
Elements that influence the severity are drawn from 43 data files and categorized
into Injury Severity, Crash Description, Characteristics of the Occupant, Characteristics
of the Driver (demographics, fatigue, history of crashes/violations, behavior, health,
drugs/alcohol), Characteristics of the Vehicle (physical attributes, cargo, history of
crashes/violations), Characteristics of the Environment (roadway features, weather) and
Characteristics of the Carrier.
Furthermore, the dataset can be divided into occupant, vehicle, truck, crash, and
carrier levels to better describe and analyze the factors, since the data were collected by
several departments focusing on different fields and some records are incomplete
for some of the crashes.
3.2 Sample Formation Procedure
Extensive data processing was undertaken which involved cleaning, consistency
checks, and variable selection (this was particularly important given the significant
correlations observed among the different variables in the database) using the SPSS
statistical software.
3.2.1 Selecting Cases
Some cases in the raw data are from the pilot phase of the project or were found not to
meet the selection criteria. The variable RATWeight is used as a weight to produce nationally
representative estimates, and zero-weight cases are dropped first. After dropping
107 zero-weight cases, 963 cases remain for further processing.
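The case-selection step amounts to a simple filter on the sampling weight. The processing in this study was carried out in SPSS; the following is only a minimal Python sketch of the same idea, in which the record layout and the CaseID field are hypothetical.

```python
def drop_zero_weight(cases, weight_field="RATWeight"):
    """Keep only cases whose sampling weight is strictly positive."""
    return [c for c in cases if c[weight_field] > 0]

# Toy records; the CaseID values and weights are made up for illustration.
raw_cases = [
    {"CaseID": 1, "RATWeight": 12.5},
    {"CaseID": 2, "RATWeight": 0.0},   # e.g. a pilot-phase case
    {"CaseID": 3, "RATWeight": 3.1},
]

kept = drop_zero_weight(raw_cases)
print(len(raw_cases) - len(kept), "dropped,", len(kept), "kept")
```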
3.2.2 Cleaning and Consistency Checking
Because the raw data contain uninformative variables and inconsistent cases,
cleaning and consistency checks are applied during sample formation to keep
the data usable and consistent.
Sample cleaning starts with a frequency test for all factors, because a variable
is uninformative if over 99% of its observations take the same value. Therefore, for
each category, we test the frequencies at the corresponding level (e.g., injury severity at
the occupant level; crash and environment description at the crash level; characteristics of
the occupant for car and truck occupants separately; characteristics of the vehicle and driver
at the car and truck levels; and characteristics of the carrier for available carrier records) and
drop the binary variables for which fewer than 1% or more than 99% of cases are present.
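The frequency screen for binary variables can be sketched as follows. This is an illustrative Python version rather than the SPSS procedure used here, and the variable names in the toy data are hypothetical.

```python
def near_constant_binaries(records, low=0.01, high=0.99):
    """Return names of 0/1 variables whose share of 1s is < low or > high."""
    flagged = []
    for name in records[0]:
        values = [r[name] for r in records if r[name] is not None]
        if not values:
            # A variable with no recorded values carries no information.
            flagged.append(name)
            continue
        share = sum(values) / len(values)
        if share < low or share > high:
            flagged.append(name)
    return flagged

# 200 toy crash records with made-up variable names.
records = [
    {"wet_road": 1 if i % 4 == 0 else 0,   # present in 25% of cases: keep
     "rare_event": 1 if i == 0 else 0}     # present in 0.5% of cases: drop
    for i in range(200)
]

print(near_constant_binaries(records))
```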
The consistency checking includes two steps. First, we check variables within
each category, such as crash type, driver and occupant number, and time of day and
daylight. Second, we check across categories, for example, the aggregated vehicle
records against the total number of vehicles, the crash type across the crash and vehicle
levels, and the number of vehicles against the crash type. The results are mostly consistent,
and the number of inconsistent cases is not large enough to affect the analysis results. After
correcting some apparently inconsistent cases using the variables recorded in detail, we
dropped the remaining 10 inconsistent cases. The sample size at each level is shown in
Table 3-1, and the variables retained from the raw data after cleaning and consistency
checking are listed in the Appendix.
3.2.3 Variable Selection
The number of variables is limited by the sample size. Variable selection
merges correlated variables and reduces the number of variables in the model, which
simplifies the modeling procedure and shortens the run time.
3.2.3.1 Crosstab check
If two or more variables describe the same or correlated aspects of a subject, they are
expected to be correlated. Taking the roadway traffic (a categories) and the restriction (b
categories) as an example, restrictions on the road are correlated with the
traffic condition. Merging the two variables into one and recombining the values
sensibly reduces the a × b cross-classified categories to a smaller number of categories, c.
The same situation exists for other variables, such as seatbelt use, eye vision, and carrier status.
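The merging of two correlated categorical variables can be sketched in Python as below. The category labels and the recoding map are hypothetical and serve only to show how a 2 x 2 cross tabulation collapses into three merged categories.

```python
from collections import Counter

# Toy observations of (traffic, restriction); the labels are made up.
pairs = [
    ("free_flow", "none"), ("free_flow", "none"), ("free_flow", "none"),
    ("congested", "none"), ("congested", "lane_closed"),
    ("congested", "lane_closed"), ("free_flow", "lane_closed"),
]

# Cross tabulation: number of cases in each (traffic, restriction) cell.
crosstab = Counter(pairs)

# Hypothetical recoding of the 2 x 2 = 4 cells into 3 merged categories.
MERGE = {
    ("free_flow", "none"): "unimpeded",
    ("free_flow", "lane_closed"): "impeded",
    ("congested", "none"): "impeded",
    ("congested", "lane_closed"): "restricted_and_congested",
}
merged = Counter(MERGE[p] for p in pairs)

for cell, n in sorted(crosstab.items()):
    print(cell, n)
print(dict(merged))
```

Inspecting the cell counts first helps to decide which cells are sparse enough to be combined.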
3.2.3.2 Classification analysis
When dealing with more than three correlated or similar variables, cross tabulation
analysis may be required several times, so classification analysis can be applied to reduce
the variables more easily. In the cleaned dataset of crash characteristics, seven variables are
used to describe the weather condition before and at the time of the crash (ENVRain,
ENVSnow, ENVFog, AFTRain, AFTSnow, AFTFog, and Weather). The hierarchical cluster
command can merge these seven binary variables into clusters using distance or similarity
measures. The cluster output implies that ENVRain, ENVFog, AFTRain, and
AFTFog can be aggregated into one group and ENVSnow and AFTSnow into another
group. For continuous variables such as driver height and weight, there are still more than
6 groups for each variable after grouping the cases by ten-centimeter and ten-kilogram
intervals. The K-means cluster method can be used to group the cases into clusters
using the distribution of these two variables.
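The clustering of binary variables can be sketched in plain Python. The simple-matching distance and average linkage used below are assumptions (the SPSS settings are not reported here), the toy data are made up so that the grouping resembles the one described above, and the Weather variable is omitted because its coding is not described.

```python
from itertools import combinations

def mismatch(u, v):
    """Share of cases on which two binary variables disagree."""
    return sum(a != b for a, b in zip(u, v)) / len(u)

def agglomerate(columns, n_clusters):
    """Average-linkage agglomerative clustering of named binary columns."""
    clusters = [[name] for name in columns]

    def linkage(c1, c2):
        dists = [mismatch(columns[a], columns[b]) for a in c1 for b in c2]
        return sum(dists) / len(dists)

    while len(clusters) > n_clusters:
        # Find and merge the closest pair of clusters (i < j always holds).
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]

# Toy 0/1 data for eight crashes; the values are made up for illustration.
weather = {
    "ENVRain": [1, 1, 0, 0, 1, 0, 0, 0],
    "AFTRain": [1, 1, 0, 0, 1, 0, 0, 0],
    "ENVFog":  [1, 0, 0, 0, 1, 0, 0, 0],
    "AFTFog":  [1, 1, 0, 0, 0, 0, 0, 0],
    "ENVSnow": [0, 0, 1, 1, 0, 0, 1, 0],
    "AFTSnow": [0, 0, 1, 1, 0, 0, 0, 0],
}

print(agglomerate(weather, n_clusters=2))
```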
3.2.3.3 Missing data
There were several variables (particularly those describing driver and vehicle
characteristics) of potential interest that had missing values for a significant fraction of
the cases. Further, it was also found that these values were "systematically missing" for
many of the variables. For example, the value of certain variables could be more likely
to be missing for crashes with greater severity. Alternatively, the value of other variables
could be more likely to be missing for crashes with lesser severity. Consequently,
simply removing the cases with missing values would skew the sample (in addition to
reducing the sample size). Therefore, we chose to retain all cases, and indicator
variables were created to explicitly identify cases with missing values for each of the
variables of interest. These indicator variables were also included in the model
specifications and some of these turned out to be statistically significant (discussed
further in the next section).
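Creating a missing-value indicator can be sketched as follows. The field name is hypothetical, and filling the missing entry with 0 is an assumption made here so that the variable can still enter a model alongside its indicator.

```python
def add_missing_indicator(records, field, fill_value=0):
    """Add `<field>_missing` (0/1) and fill missing values with `fill_value`.

    The input records are left unchanged; new dictionaries are returned.
    """
    out = []
    for r in records:
        row = dict(r)
        is_missing = row.get(field) is None
        row[field + "_missing"] = 1 if is_missing else 0
        if is_missing:
            row[field] = fill_value
        out.append(row)
    return out

# Toy driver records; the field name and values are hypothetical.
drivers = [
    {"driver_id": 1, "hours_driving": 6.5},
    {"driver_id": 2, "hours_driving": None},
    {"driver_id": 3, "hours_driving": 2.0},
]

for row in add_missing_indicator(drivers, "hours_driving"):
    print(row)
```

Keeping every case and letting the indicator enter the model preserves the sample size and allows the effect of missingness itself to be estimated.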
Missing data are common in this dataset. Handling them properly starts with
identifying the type of missingness, since each type calls for a different
treatment.
Harrell (2001) classified missing data into three types: missing completely at
random (MCAR), missing at random (MAR), and informative missing (IM).
Missing completely at random (MCAR) refers to the situation in which data elements
are missing for reasons that are unrelated to any characteristics or responses of the
subject. Examples include a subject omitting the response to a question for reasons
unrelated to the response she would have given or to her characteristics (e.g., overlooking
the question), and data being lost because of a mistake in the experiment. This kind of
missing data usually occurs in small numbers.
Missing at random (MAR) does not mean that data elements are missing purely at
random; rather, the probability that a value is missing depends on the values of variables that
were actually measured and does not depend on the unobserved data. An example is
large-truck weight, which is not always recorded because not all carriers provide this
information, although the carrier information itself is recorded.
Informative missing (IM) refers to the situation in which elements are more likely to be
missing when the true values of the variable are systematically higher or lower. For example,
a severely injured driver would have difficulty helping to record driving information, such as
speed, emotional state, or road familiarity. IM is also called nonignorable nonresponse.
In the studied dataset, a large number of variables have values recorded as
unknown. When the variable records the crash type, location, or some other
existing condition, the unknown values are typically missing completely at random (MCAR),
which is the most common case in the dataset. Such unknown values are few and ignorable.
Usually, they are combined with other low-frequency values into a new value. If the
unknown values account for under 1% of cases, assigning them to any group will not affect
the results significantly. Unknown values accounting for over 1% of cases can be recoded to
other values by linear interpolation or by the mean. This procedure reduces the number of
values for these variables.
Another kind of "unknown" or missing record is not MCAR and is nonignorable.
For example, for alcohol presence at the crash level, 11.9% of cases are reported as unknown.
This may be caused by the inability to obtain a record from severely injured people or by
carelessness when recording a no-injury crash. In the crash reports, there are also large
numbers of missing records, such as cargo weight and driver behavior, recorded as unknown.
For these types of missing data, we keep unknown as a value and model its impact
on the injury response.
The same unknown situation can occur in the records of individual age, gender,
speed limit, etc. However, if the count of unknown values for a variable is not large, it is not
necessary to keep unknown as a special value. A low-frequency (2%)
unknown value is considered ignorable and is replaced by the mean,
median, or linear trend.
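The treatment of unknown values by frequency can be summarized in a short sketch. Using a single 2% cut-off and the median as the replacement are simplifying assumptions made for illustration; in practice the choice also depends on the missingness mechanism discussed above.

```python
from statistics import median

def handle_unknown(values, threshold=0.02):
    """Impute rare unknowns with the median; otherwise keep them as a category.

    values    -- list of numbers, with None marking "unknown"
    threshold -- share of unknowns treated as ignorable (assumed cut-off)
    Returns (strategy, processed_values).
    """
    known = [v for v in values if v is not None]
    share_unknown = 1 - len(known) / len(values)
    if share_unknown <= threshold:
        fill = median(known)
        return "impute", [fill if v is None else v for v in values]
    # Too many unknowns to ignore: keep them as an explicit category.
    return "keep_unknown", ["unknown" if v is None else v for v in values]

# Toy speed-limit data (mph): 1 unknown in 100 cases, then 12 in 100.
few_unknown = [55] * 60 + [65] * 39 + [None]
many_unknown = [55] * 50 + [65] * 38 + [None] * 12

print(handle_unknown(few_unknown)[0])
print(handle_unknown(many_unknown)[0])
```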
3.3 Sample Characteristics
The final "reduced" dataset assembled for this analysis was organized into
three major files: Crash-level data (including highest level of injury severity in the crash,
crash type, and environment variables), Vehicle-level data (characteristics of the trucks
and cars such as age, body-type, cargo, occupancy levels, and deficiencies), and
Driver-level data (characteristics of the truck and car drivers including demographics,
fatigue, health, and behavior). The effects of all these different variables were examined
during the statistical-modeling procedure.
At the crash level, we have 953 cases in the estimation sample. For the occupant
injury study, we select 918 representative crashes, which have at most 3 trucks, at most
3 cars, and at most 5 occupants per vehicle (including the driver). In total, there are 2374
occupants in the selected sample, including 1038 truck drivers, 145 truck occupants,
818 car drivers, and 373 car passengers.
Table 3-1. Sample size for each category.
Variable category         Level             Sample size (cases)
Injury severity           Occupant          2699
Crash                     Vehicle           2056
                          Crash             953
Truck and truck driver    Truck             1108
Car and car driver        Car               941
Truck occupant            Truck occupant    151
Car occupant              Car occupant      499
Carrier                   Truck carrier     733
CHAPTER 4 MODELS FOR CRASH LEVEL INJURY
The intent of this chapter is to present a comprehensive analysis of the "injury
severity" of all types of crashes involving large trucks. Because this is the first step of
this dissertation, as many variables as possible are examined to understand the
relationship between injury severity and a vast number of interdependent explanatory
factors using crash-level data. The injury severity of a crash is defined as the highest
level of severity among all those injured in the crash.
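The crash-level severity definition can be written as a maximum over an ordinal coding of the occupant severities. The sketch below uses the five severity levels listed in Chapter 3; the example crash is made up.

```python
# Ordinal coding of the five severity levels, from least to most severe.
SEVERITY_ORDER = [
    "no injury",
    "possible injury",
    "non-incapacitating injury",
    "incapacitating injury",
    "killed",
]
RANK = {label: i for i, label in enumerate(SEVERITY_ORDER)}

def crash_severity(occupant_severities):
    """Return the highest severity among the occupants of one crash."""
    return max(occupant_severities, key=RANK.__getitem__)

# Toy crash with three occupants (made-up values).
print(crash_severity(["no injury", "incapacitating injury", "possible injury"]))
```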
4.1 Sample Data
There are 953 crashes in the estimation sample. The data were collected from
both police accident reports and additional sources such as site investigation,
interviews, and review of medical records conducted by the LTCCS team. Each crash
was investigated by a two-person team comprising a trained researcher and a state
truck inspector (see USDOT/FMCSA, 2006, for an overview). Correspondingly, there are two
measures of injury severity available for each crash: (1) a measure determined from the
police accident reports (referred to as PAR) and (2) a measure determined by the
LTCCS researchers based on medical records and case narratives (referred to as RES
for researcher-determined).
A cross tabulation of the highest injury severity level of the 953 crashes
determined from each of PAR and RES is presented in Table 4-1. In the case of PAR,
the severity was measured on a four-item scale: possible injury (14%),
non-incapacitating injury (32%), incapacitating injury (32%), and killed (22%). In the case of
RES, the severity was measured on a three-item scale: non-incapacitating injury (48%),
incapacitating injury (29%), and killed (23%). The RES scale has fewer levels because the
crashes in the LTCCS were sampled such that there is at least one injured person on the
RES scale.
The cross tabulation indicates that fatal crashes are recorded almost identically by
both measures (approximately 22% each). However, discrepancies are observed in the
case of non-fatal crashes. About 12% of all crashes (119 crashes) that were classified
as level C (possible injury) by PAR were classified as level B (non-incapacitating injury) by
RES, indicating under-estimation of the injury severity by the police reports (consistent with
the findings of others such as Tsui et al., 2009). At the same time, 103 crashes classified as
level A (incapacitating injury) by PAR were classified as level B by RES, indicating an
over-estimation by the police reports.
Given these discrepancies, it is useful to compare models estimated using each of
these severity measures. Such an effort is undertaken in this study.
It is also important to note that the sampling of the crashes was not purely random.
Therefore, weights have been calculated to scale the sample to be nationally
representative (the RATWEIGHT variable described in the user's manual by USDOT,
2006). Table 4-1 also presents the weighted percentages of the injury-severity levels
according to the two measures (the last row and last column). The results indicate that
crashes with higher levels of severity (particularly fatal crashes) have been
oversampled.
Table 4-1. Cross tabulation of police-determined (PAR) and researcher-determined (RES) injury severity levels.
                                                 RES
PAR                         Possible  Non-incapacitating  Incapacitating  Killed  Total  Percentage  Weighted
                            Injury    Injury              Injury                                     Percentage
Possible Injury                 0         119                  15             0    134     14.06       14.39
Non-incapacitating Injury       0         239                  61             2    302     31.69       31.32
Incapacitating Injury           0         103                 192            10    305     32.00       45.90
Killed                          0           1                   5           206    212     22.25        8.40
Total                           0         462                 273           218
Percentage                                 48.48               28.65         22.88
Weighted Percentage                        55.00               36.52          8.48
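The unweighted totals and percentages in Table 4-1 can be reproduced from the cell counts, as in the sketch below. The weighted percentages cannot be recomputed this way because they require the case-level RATWEIGHT values, which are not part of the table.

```python
LEVELS = ["Possible", "Non-incapacitating", "Incapacitating", "Killed"]

# Rows are PAR levels, columns are RES levels, both in the order of LEVELS.
COUNTS = [
    [0, 119, 15, 0],
    [0, 239, 61, 2],
    [0, 103, 192, 10],
    [0, 1, 5, 206],
]

total = sum(map(sum, COUNTS))
row_totals = [sum(row) for row in COUNTS]        # PAR marginals
col_totals = [sum(col) for col in zip(*COUNTS)]  # RES marginals

def pct(n):
    """Percentage of all crashes, rounded to two decimals."""
    return round(100 * n / total, 2)

print("crashes:", total)
for name, r, c in zip(LEVELS, row_totals, col_totals):
    print(f"{name:20s} PAR {r:4d} ({pct(r):5.2f}%)   RES {c:4d} ({pct(c):5.2f}%)")

# Share of crashes on which the two measures agree (diagonal cells).
agree = sum(COUNTS[i][i] for i in range(len(LEVELS)))
print("agreement:", pct(agree), "%")
```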
4.2 Methodology
Injury severity is generally recorded on an ordinal scale. Most commonly, a five-level
scale in increasing order of injury severity is used: Property-damage only, Possible
Injury, Non-Incapacitating Injury, Incapacitating Injury, and Killed (Fatal). Thus, an
ordered-response discrete-choice model is appropriate for the analysis of such data. In
fact, this has been a popular approach for modeling injury severity in general (for
example, Kockelman and Kweon, 2001). In this approach, for any crash n, the observed,
ordinal, injury-severity level