
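A Method of Deep Reinforcement Learning-based Ramp Metering for Mainline-ramp Coordination

ZHANG Yujie, TANG Haotong, XU Qian, YAO Jinqiang, XIONG Hui, XU Zhigang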

Citation: ZHANG Yujie, TANG Haotong, XU Qian, YAO Jinqiang, XIONG Hui, XU Zhigang. A Method of Deep Reinforcement Learning-based Ramp Metering for Mainline-ramp Coordination[J]. Journal of Transport Information and Safety, 2025, 43(6): 86-97. doi: 10.3963/j.jssn.1674-4861.2025.06.009

doi: 10.3963/j.jssn.1674-4861.2025.06.009
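Funding:

Key Program of the National Natural Science Foundation of China (52432013)

Science and Technology Program of Zhejiang Communications Group Technology R&D Institute Co., Ltd. (202303)

Details
    About the author:

    ZHANG Yujie (b. 1982), master's degree, professor-level senior engineer. Research interests: smart transportation, intelligent operation and maintenance of transportation infrastructure, bridge and tunnel engineering. E-mail: yj_zhang@tongji.edu.cn

    Corresponding author:

    XU Zhigang (b. 1979), Ph.D., professor. Research interests: Internet of Vehicles and autonomous driving, vehicle-infrastructure cooperation. E-mail: xuzhigang@chd.edu.cn

  • CLC number: U491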


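  • Abstract: Freeway on-ramp merging areas are prone to congestion and crashes. To improve on the response speed and control accuracy of conventional ramp metering algorithms, this paper studies a reinforcement learning-based ramp metering method. The ramp metering problem is formulated as a Markov decision process. The action space is defined over discrete signal phases to improve training efficiency, and a state space and a multi-dimensional reward function are constructed to cover the operating conditions of both the mainline and the ramp. A real-time traffic detection mechanism is added at the state-perception level, a minimum phase duration constraint is applied to the action output to avoid high-frequency phase switching, and prioritized experience replay is used during training to improve model performance. In addition, to improve convergence speed and generalization in complex traffic environments, the deep network structure is optimized by introducing residual connections and layer normalization, yielding a lightweight and efficient multilayer perceptron. Systematic experiments on a microscopic simulation platform were carried out to verify the control performance of the proposed method. The results show that, compared with the no-control scenario, the proposed mainline-ramp coordinated ramp metering method increases system throughput by 52.67% and reduces average travel time by 58.21%, and that the traffic efficiency of both the mainline and the ramp rises markedly under the proposed control. The method was also applied to an entrance flow-restriction engineering case on the Hangzhou West to Yuqian interchange section of the Hangzhou-Huizhou Expressway, with the network structure and traffic flow characteristics of the section fully reproduced. The results show that both the number of vehicles travelling in the network and the mainline average speed increase relative to the no-control scenario, and that speed fluctuations are milder, indicating good potential for engineering deployment.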
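The minimum phase duration constraint mentioned in the abstract can be illustrated with a small filter placed between the agent's requested action and the signal. This is a minimal sketch, not the authors' implementation: the two-phase encoding (`RED`, `GREEN`), the class name `MinPhaseDurationFilter`, and the value of `min_steps` are assumptions, since the abstract only states that actions are discrete signal phases and that a minimum phase duration is enforced.

```python
from dataclasses import dataclass

# Assumed phase encoding: the source only says the action space is made of
# discrete signal phases; a two-phase red/green signal is hypothetical.
RED, GREEN = 0, 1


@dataclass
class MinPhaseDurationFilter:
    """Hold the current phase until it has lasted at least `min_steps` control steps."""

    min_steps: int      # assumed value, not given in the source
    phase: int = RED    # phase currently shown by the ramp signal
    elapsed: int = 0    # control steps already spent in the current phase

    def apply(self, requested: int) -> int:
        """Return the phase actually applied for this control step."""
        if requested != self.phase and self.elapsed >= self.min_steps:
            # The current phase has met its minimum duration, so switch.
            self.phase = requested
            self.elapsed = 1
        else:
            # No switch requested, or the request came too early: keep the phase.
            self.elapsed += 1
        return self.phase


def apply_sequence(requests, min_steps):
    """Filter a whole sequence of requested phases (one per control step)."""
    flt = MinPhaseDurationFilter(min_steps=min_steps)
    return [flt.apply(r) for r in requests]


if __name__ == "__main__":
    requested = [GREEN, GREEN, GREEN, GREEN, RED, GREEN, RED, RED, RED]
    print(apply_sequence(requested, min_steps=3))
```

With `min_steps=3`, the early request for green is delayed until the initial red has lasted three steps, and the single-step request for red in the middle of the green phase is suppressed, so the applied sequence is `[0, 0, 0, 1, 1, 1, 0, 0, 0]`.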

     

  • Figure 1. Schematic diagram of ramp metering system

    Figure 2. State space coding

    Figure 3. Flowchart of reward calculation

    Figure 4. DDQN-WRTD algorithm structure

    Figure 5. Deep network structure

    Figure 6. Simulation platform

    Figure 7. Model training

    Figure 8. Upstream and downstream traffic flow

    Figure 9. Total number of vehicles and average travel time at different times

    Figure 10. Simulation result index comparison
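The abstract describes the deep network (Figure 5) as a lightweight multilayer perceptron with residual connections and layer normalization. The sketch below shows one common way to combine these two ingredients in a Q-network forward pass, using only the standard library. The pre-normalization ordering, the layer widths, the number of blocks, the state dimension, and the function names are all assumptions for illustration; the page does not give the actual architecture.

```python
import math
import random


def layer_norm(x, eps=1e-5):
    """Normalize a feature vector to zero mean and unit variance (no learned gain or bias)."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]


def linear(x, weights, bias):
    """Dense layer; `weights` holds one row per output unit."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def relu(x):
    return [max(0.0, v) for v in x]


def residual_block(x, weights, bias):
    """y = x + ReLU(Linear(LayerNorm(x))); input and output widths must match."""
    h = relu(linear(layer_norm(x), weights, bias))
    return [xi + hi for xi, hi in zip(x, h)]


def init_layer(n_out, n_in, rng, scale=0.1):
    """Small random weights and zero bias (illustrative initialization only)."""
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def build_q_network(state_dim, hidden, n_blocks, n_actions, seed=0):
    rng = random.Random(seed)
    return {
        "input": init_layer(hidden, state_dim, rng),
        "blocks": [init_layer(hidden, hidden, rng) for _ in range(n_blocks)],
        "head": init_layer(n_actions, hidden, rng),
    }


def q_values(state, params):
    """Map a state vector to one Q-value per discrete signal phase."""
    h = relu(linear(state, *params["input"]))
    for weights, bias in params["blocks"]:
        h = residual_block(h, weights, bias)
    return linear(layer_norm(h), *params["head"])


if __name__ == "__main__":
    # Hypothetical sizes: 6 state features, 2 signal phases.
    params = build_q_network(state_dim=6, hidden=8, n_blocks=2, n_actions=2)
    print(q_values([0.2, 0.5, 0.1, 0.0, 0.7, 0.3], params))
```

The skip path lets each block fall back to the identity when its weights contribute nothing, which is the usual argument for why residual connections ease optimization. Layer normalization keeps the scale of the hidden features stable across very different traffic states.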

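    Table 1. Simulation experiment hyperparameter settings

    | Hyperparameter | Value |
    | --- | --- |
    | Discount factor γ | 0.75 |
    | Learning rate α | 0.001 |
    | Batch size (batch_size) | 64 |
    | Experience replay capacity (memory_size) | 50 000 |
    | Training episodes (episode) | 200 |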
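To show how the Table 1 settings might feed the prioritized experience replay mentioned in the abstract, here is a minimal sketch of a proportional prioritized buffer. Only the five hyperparameter values come from Table 1. The priority exponent, the small constant `eps`, the class and method names, and the linear-time sampling are assumptions; the page does not describe the authors' replay implementation, and importance-sampling weights are omitted for brevity.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Hyperparams:
    """Values from Table 1."""
    gamma: float = 0.75           # discount factor
    learning_rate: float = 0.001  # learning rate
    batch_size: int = 64
    memory_size: int = 50_000     # experience replay capacity
    episodes: int = 200


class PrioritizedReplay:
    """Proportional prioritized replay with a ring buffer (illustrative only)."""

    def __init__(self, capacity, priority_exponent=0.6, eps=1e-3, seed=None):
        self.capacity = capacity
        self.priority_exponent = priority_exponent  # assumed, not from the source
        self.eps = eps                              # keeps every priority positive
        self.transitions = []
        self.priorities = []
        self.write_pos = 0
        self.max_priority = 1.0
        self.rng = random.Random(seed)

    def add(self, transition):
        # New transitions receive the largest priority seen so far, so they
        # are likely to be replayed before their TD error is known.
        if len(self.transitions) < self.capacity:
            self.transitions.append(transition)
            self.priorities.append(self.max_priority)
        else:
            self.transitions[self.write_pos] = transition
            self.priorities[self.write_pos] = self.max_priority
        self.write_pos = (self.write_pos + 1) % self.capacity

    def sample(self, batch_size):
        """Draw indices with probability proportional to priority ** exponent."""
        weights = [p ** self.priority_exponent for p in self.priorities]
        return self.rng.choices(range(len(self.transitions)), weights=weights, k=batch_size)

    def update(self, indices, td_errors):
        """Set each sampled transition's priority to |TD error| + eps."""
        for i, err in zip(indices, td_errors):
            priority = abs(err) + self.eps
            self.priorities[i] = priority
            self.max_priority = max(self.max_priority, priority)


HP = Hyperparams()

if __name__ == "__main__":
    replay = PrioritizedReplay(capacity=HP.memory_size, seed=0)
    for _ in range(1000):
        replay.add(("state", 0, 0.0, "next_state"))  # placeholder transition
    batch_idx = replay.sample(HP.batch_size)
    replay.update(batch_idx, [0.5] * len(batch_idx))
    print(len(batch_idx))
```

Sampling here recomputes all weights on every call, which is simple but linear in the buffer size. Implementations that train at scale usually store priorities in a sum tree to make sampling logarithmic.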

    Table 2. Statistics of mainline traffic parameters

    | Method | Mainline throughput/(veh/h) (change vs. no control/%) | Mainline average travel time/s (change vs. no control/%) |
    | --- | --- | --- |
    | No control | 1 713 | 222.195 |
    | ALINEA | 2 834 (+65.44) | 94.062 (-57.67) |
    | ML | 2 791 (+62.93) | 99.546 (-55.20) |
    | MPC-RL | 3 122 (+82.25) | 70.458 (-68.29) |
    | DDQN | 3 235 (+88.85) | 65.5457 (-70.50) |
    | DDQN-WRTD (this paper) | 3 053 (+78.23) | 75.831 (-65.87) |

    Table 3. Statistics of ramp traffic parameters

    | Method | Ramp throughput/(veh/h) (change vs. no control/%) | Ramp average travel time/s (change vs. no control/%) |
    | --- | --- | --- |
    | No control | 438 | 263.003 |
    | ALINEA | 230 (-47.49) | 375.413 (+42.74) |
    | ML | 228 (-47.95) | 383.483 (+45.82) |
    | MPC-RL | 207 (-52.74) | 403.673 (+53.49) |
    | DDQN | 203 (-58.31) | 445.500 (+69.38) |
    | DDQN-WRTD (this paper) | 231 (-47.26) | 367.128 (+39.59) |

    Table 4. Statistics of overall traffic flow parameters

    | Method | Overall throughput/(veh/h) (change vs. no control/%) | Overall average travel time/s (change vs. no control/%) |
    | --- | --- | --- |
    | No control | 2 151 | 230.505 |
    | ALINEA | 3 064 (+42.45) | 115.181 (-50.03) |
    | ML | 3 019 (+40.35) | 120.989 (-47.51) |
    | MPC-RL | 3 329 (+54.77) | 82.403 (-64.25) |
    | DDQN | 3 438 (+59.83) | 87.980 (-61.83) |
    | DDQN-WRTD (this paper) | 3 284 (+52.67) | 96.321 (-58.21) |
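The percentages in parentheses in Tables 2 to 4 are changes relative to the no-control row. The short sketch below recomputes them for DDQN-WRTD from the tabulated throughput and travel time values. The helper name `change_rate` and the dictionary layout are illustrative choices; the numbers themselves are copied from the tables.

```python
def change_rate(value, baseline):
    """Percentage change of `value` relative to `baseline`, rounded to two decimals."""
    return round((value - baseline) / baseline * 100, 2)


# (throughput in veh/h, average travel time in s), copied from Tables 2 to 4.
NO_CONTROL = {
    "mainline": (1713, 222.195),
    "ramp": (438, 263.003),
    "overall": (2151, 230.505),
}
DDQN_WRTD = {
    "mainline": (3053, 75.831),
    "ramp": (231, 367.128),
    "overall": (3284, 96.321),
}


def summarize(method, baseline):
    """Return {scope: (throughput change %, travel time change %)}."""
    return {
        scope: (
            change_rate(method[scope][0], baseline[scope][0]),
            change_rate(method[scope][1], baseline[scope][1]),
        )
        for scope in baseline
    }


if __name__ == "__main__":
    for scope, (d_flow, d_time) in summarize(DDQN_WRTD, NO_CONTROL).items():
        print(f"{scope}: throughput {d_flow:+.2f}%, travel time {d_time:+.2f}%")
```

For the overall traffic flow this reproduces the +52.67% throughput and -58.21% travel time quoted in the abstract, and the mainline and ramp throughputs in each row add up to the overall throughput (for example 3 053 + 231 = 3 284). Applying the same function to the DDQN ramp throughput (203 against 438 veh/h) gives about -53.65%, whereas Table 3 prints -58.31%, so that entry may be worth checking against the published article.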
Publication history
  • Received: 2025-07-10
  • Published online: 2026-03-13
