Publication Details

Causal analysis and classification of traffic crash injury severity using machine learning algorithms

Type: article

Author(s): Chakraborty, Meghna; Gates, Timothy J.; Sinha, Subhrajit

Pages: 12


Publication Date: Aug-2023

Journal: Data Science for Transportation

Volume: 5

Issue: 2

ISSN: 2948-135X

DOI: 10.1007/s42421-023-00076-9

Abstract: Objectives: Causal analysis and classification of traffic crash injury severity using non-parametric methods have received limited attention. This study presents a methodological framework for causal inference, using Granger causality analysis, and for injury severity classification of traffic crashes occurring on urban interstates in the State of Texas in the United States, using machine learning techniques including decision trees (DT), random forest (RF), extreme gradient boosting (XGBoost), and deep neural network (DNN).
Materials and Methods: The data used in this study were obtained for traffic crashes occurring on all urban interstates across the State of Texas over the 6-year period from 2014 to 2019. The output of the proposed severity classification approach comprises three classes: fatal and severe injury (KA) crashes, non-severe and possible injury (BC) crashes, and property damage only (PDO) crashes. While Granger causality helped identify the most influential factors affecting crash severity, the learning-based models predicted the severity classes with varying performance.
Results: Granger causality analysis identified predictors including speed limit, surface and weather conditions, traffic volume, presence of work zones, workers in work zones, and high occupancy vehicle lanes, among others, as the most important factors affecting crash severity. The prediction performance of the classifiers varied across the severity classes. Specifically, the decision tree and random forest classifiers performed best for the PDO and BC severities, respectively, whereas for the KA class, the rarest class in the data, the DNN classifier outperformed all other algorithms, most likely due to its capability of approximating nonlinear models.
In terms of overall performance, the decision tree classifier correctly predicts the severity of about 58 percent, 43 percent, and 15 percent of PDO, BC, and KA crashes, respectively. Similarly, the random forest classifier correctly predicts the severity of 55 percent, 46 percent, and 17 percent of PDO, BC, and KA crashes, respectively. The XGBoost classifier correctly predicts the severity of 56 percent, 45 percent, and 27 percent of PDO, BC, and KA crashes, respectively. Lastly, the deep neural network classifier correctly predicts the severity of 54 percent, 33 percent, and 44 percent of PDO, BC, and KA crashes, respectively. All of these percentages are for the reduced-order models, which provided superior predictions compared to the full models.
Conclusions: Overall, this study contributes to the limited body of knowledge pertaining to causal analysis and classification prediction of traffic crash injury severity using non-parametric approaches.
Clinical Relevance: None
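The per-class percentages in the abstract read most naturally as the share of crashes of each true severity class that a classifier labels correctly (per-class recall). The abstract does not name the metric, so that reading is an assumption. Under it, a minimal sketch of the calculation could look like the following; the function name `per_class_recall` and the toy labels are hypothetical and are not data or code from the study.

```python
from collections import Counter

# Severity classes as named in the abstract.
CLASSES = ("PDO", "BC", "KA")


def per_class_recall(y_true, y_pred, classes=CLASSES):
    """Share of crashes of each true class that received that same predicted class."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    # NaN marks a class that never occurs in y_true.
    return {c: hits[c] / totals[c] if totals[c] else float("nan") for c in classes}


# Made-up labels purely for illustration; NOT results from the study.
y_true = ["PDO", "PDO", "PDO", "PDO", "BC", "BC", "BC", "KA", "KA", "KA"]
y_pred = ["PDO", "PDO", "BC", "KA", "BC", "PDO", "BC", "KA", "BC", "PDO"]

rates = per_class_recall(y_true, y_pred)
for severity, rate in rates.items():
    print(f"{severity}: {rate:.0%} of crashes classified correctly")
```

Reporting the rate separately for each class, rather than a single overall accuracy, keeps performance on a rare class such as KA from being masked by the far more common PDO crashes.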