Volume 43 Issue 5
Oct.  2025
WANG Jianyu, DONG Yue, CHEN Xiantian, ZHAO Pengfei, ZHOU Bei, NA Bo. Coupling Analysis of Causative Factors for Severe Traffic Accidents Considering Sample Imbalance[J]. Journal of Transport Information and Safety, 2025, 43(5): 70-78. doi: 10.3963/j.jssn.1674-4861.2025.05.007

Coupling Analysis of Causative Factors for Severe Traffic Accidents Considering Sample Imbalance

doi: 10.3963/j.jssn.1674-4861.2025.05.007
  • Received Date: 2024-07-08
    Available Online: 2026-03-05
  • Road traffic accidents occur frequently, yet accident data classified by traditional severity levels are often imbalanced. To explore the coupling effects of multidimensional factors on severe traffic accidents under sample imbalance, this study proposes an analytical framework that integrates the Adaptive Synthetic Sampling (ADASYN) algorithm, a Stacking ensemble learning model, and the Apriori algorithm. Using 2017 to 2021 data from the U.S. Department of Transportation's Fatality Analysis Reporting System (FARS), fifteen candidate feature variables are selected across four dimensions (human, vehicle, road, and environment) to analyze how multidimensional factor coupling affects the occurrence of severe accidents. The ADASYN algorithm is employed to address sample imbalance. Four classical machine learning models, namely random forest (RF), categorical boosting (CatBoost), extreme gradient boosting (XGBoost), and gradient boosting decision tree (GBDT), are selected as base learners. Five meta-learners, namely logistic regression, Gaussian Naïve Bayes, support vector machine (SVM), light gradient boosting machine (LightGBM), and multilayer perceptron (MLP), are compared to identify the Stacking ensemble model with the strongest generalization performance. Based on the optimal model, a feature importance ranking is obtained to determine the key influencing factors, and the Apriori algorithm is then applied for multidimensional coupling analysis, which explores the impact of five-dimensional factor coupling on the rate of severe accidents.
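The oversampling idea behind ADASYN can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the function name `adasyn_sketch`, the parameters `k`, `beta`, and `seed`, and the assumption that all features are numeric (FARS variables are largely categorical and would need encoding first) are introduced here only for illustration. The sketch weights each minority sample by the share of majority samples among its `k` nearest neighbours, then interpolates new points toward nearby minority samples.

```python
import math
import random


def adasyn_sketch(X, y, minority=1, k=3, beta=1.0, seed=0):
    """Return synthetic minority samples (list of feature lists).

    Hypothetical helper for illustration; assumes numeric features.
    """
    rng = random.Random(seed)
    minority_idx = [i for i, label in enumerate(y) if label == minority]
    n_min = len(minority_idx)
    n_maj = len(y) - n_min
    # Total number of synthetic samples wanted (beta=1.0 aims at full balance).
    total_new = int(round((n_maj - n_min) * beta))
    if total_new <= 0 or n_min < 2:
        return []

    # Step 1: difficulty of each minority sample, measured as the share of
    # majority-class points among its k nearest neighbours.
    ratios = []
    for i in minority_idx:
        others = [j for j in range(len(X)) if j != i]
        nearest = sorted(others, key=lambda j: math.dist(X[i], X[j]))[:k]
        n_major = sum(1 for j in nearest if y[j] != minority)
        ratios.append(n_major / len(nearest))

    total = sum(ratios)
    if total == 0:
        # No minority sample has majority neighbours: fall back to equal weights.
        weights = [1.0 / n_min] * n_min
    else:
        weights = [r / total for r in ratios]

    # Step 2: allocate synthetic samples in proportion to difficulty.
    # Because of rounding, the counts may not sum exactly to total_new.
    counts = [int(round(w * total_new)) for w in weights]

    # Step 3: interpolate between each minority sample and a randomly chosen
    # minority-class neighbour.
    synthetic = []
    for i, count in zip(minority_idx, counts):
        mates = [j for j in minority_idx if j != i]
        mates = sorted(mates, key=lambda j: math.dist(X[i], X[j]))[:k]
        for _ in range(count):
            j = rng.choice(mates)
            lam = rng.random()
            synthetic.append([a + lam * (b - a) for a, b in zip(X[i], X[j])])
    return synthetic


# Toy data (made up): six majority points (label 0), three minority points (label 1).
X_toy = [[0.0, 0.0], [0.2, 0.1], [0.1, 0.3], [0.3, 0.2], [0.2, 0.4],
         [0.4, 0.4], [1.0, 1.0], [1.1, 0.9], [0.5, 0.5]]
y_toy = [0, 0, 0, 0, 0, 0, 1, 1, 1]
new_points = adasyn_sketch(X_toy, y_toy, minority=1, k=3, seed=0)
```

In this toy set the minority point closest to the majority cluster receives the largest share of synthetic samples, which is the adaptive behaviour that distinguishes ADASYN from uniform oversampling.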
The results indicate that: ① The Stacking ensemble model with logistic regression as the meta-learner and RF, CatBoost, XGBoost, and GBDT as base learners achieves the best overall performance, with a recall of 0.80; ② Five factors (road type, season, collision type, lighting conditions at the time of the collision, and driver alcohol consumption) account for 53.2% of the total importance of all factors, substantially more than the other variables. Among them, collisions involving "impact with trees or other pole-like objects" exhibit the highest severe accident rate at 86.2%, and the severe accident rate under illuminated conditions is 13.5% higher than under non-illuminated conditions; ③ Multidimensional factor coupling analysis reveals that the probability of severe crashes is highest when the following factors coexist: municipal roads, sober drivers, transitions between unlit and lit lighting conditions at the time of the collision, and the autumn season. Under this coupled condition, the confidence level reaches 89.0%, challenging the conventional perception that non-drinking is a low-risk factor.
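To make the confidence figure concrete, the following is a minimal sketch of level-wise Apriori mining and of how the confidence of a rule such as "coupled conditions → severe accident" is derived from itemset supports. The transactions, item labels, threshold, and the helper names `frequent_itemsets` and `rule_confidence` are hypothetical and chosen only for illustration; they are not the FARS data or the authors' code.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Level-wise Apriori: return {itemset: support} for all frequent itemsets."""
    n = len(transactions)
    items = sorted({item for t in transactions for item in t})
    current = [frozenset([item]) for item in items]
    frequent = {}
    k = 1
    while current:
        # Count the support of every candidate of size k.
        level = {}
        for cand in current:
            support = sum(1 for t in transactions if cand <= t) / n
            if support >= min_support:
                level[cand] = support
        frequent.update(level)
        # Join step: merge frequent k-itemsets into (k+1)-item candidates.
        k += 1
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: keep a candidate only if all its (k-1)-subsets are frequent.
        current = [c for c in candidates
                   if all(frozenset(s) in level for s in combinations(c, k - 1))]
    return frequent


def rule_confidence(frequent, antecedent, consequent):
    """confidence(A -> C) = support(A and C together) / support(A)."""
    a = frozenset(antecedent)
    return frequent[a | frozenset(consequent)] / frequent[a]


# Made-up accident records, one set of factor labels per crash.
toy_transactions = [
    {"municipal_road", "autumn", "sober", "severe"},
    {"municipal_road", "autumn", "sober", "severe"},
    {"municipal_road", "autumn", "drinking", "severe"},
    {"highway", "summer", "sober", "not_severe"},
    {"municipal_road", "autumn", "sober", "not_severe"},
]
itemsets = frequent_itemsets(toy_transactions, min_support=0.4)
conf = rule_confidence(itemsets, {"municipal_road", "autumn", "sober"}, {"severe"})
```

Here the antecedent appears in three of the five toy records and two of those are severe, so the rule's confidence is 2/3. Read the same way, the 89.0% reported above means that, among records matching the coupled condition, 89.0% are severe accidents.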

     

    Figures(2)  / Tables(6)
