A Method of Deep Reinforcement Learning-based Ramp Metering for Mainline-ramp Coordination
Keywords:
- traffic engineering
- ramp metering
- DDQN-WRTD algorithm
- deep reinforcement learning
- simulation verification
Abstract: Expressway ramp merging areas are prone to congestion and frequent accidents. To improve on the response speed and control accuracy of traditional ramp metering algorithms, a ramp metering method based on reinforcement learning is studied. The ramp metering problem is formulated as a Markov decision process. The action space is built from discrete signal phases to improve training efficiency, and a state space and a multi-dimensional reward function are constructed to cover the operating states of both the mainline and the ramp. A real-time traffic detection mechanism is added at the state perception level, a minimum phase duration constraint is imposed on the action output to avoid high-frequency phase switching, and prioritized experience replay is used during training to improve model performance. To improve convergence speed and generalization in complex traffic environments, the deep network structure is also optimized: residual connections and layer normalization are introduced to build a lightweight and efficient multilayer perceptron. Systematic experiments on a microscopic simulation platform verify the control effect of the proposed method. Compared with the no-control scenario, the proposed mainline-ramp coordinated control increases system throughput by 52.67% and reduces average travel time by 58.21%, and traffic efficiency on the mainline and ramps improves significantly. The proposed method is deployed in the entrance flow restriction project on the Hangzhou West to Yuqian Interchange section of the Hangzhou-Huizhou Expressway, with the road network structure and traffic flow characteristics of the section fully reproduced. The results indicate that, compared with the no-control scenario, both the number of vehicles in the network and the average mainline speed increase, while speed fluctuations are more moderate, which suggests that the method has high potential for engineering deployment.
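The minimum phase duration constraint mentioned in the abstract can be illustrated with a small wrapper around the agent's action output. This is only a sketch under stated assumptions: the class name `MinPhaseHold`, the encoding of phases as integers, and the threshold of three control steps are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass


@dataclass
class MinPhaseHold:
    """Hold the current signal phase for at least `min_steps` control steps."""

    min_steps: int          # assumed minimum phase duration, in control steps
    current_phase: int = 0  # assumed initial phase (for example 0 = red, 1 = green)
    elapsed: int = 0        # steps already spent in the current phase

    def apply(self, requested_phase: int) -> int:
        """Return the phase actually sent to the ramp signal for this step."""
        wants_switch = requested_phase != self.current_phase
        if wants_switch and self.elapsed >= self.min_steps:
            # The current phase has lasted long enough, so the switch is allowed.
            self.current_phase = requested_phase
            self.elapsed = 0
        # Count this step towards the (possibly new) current phase.
        self.elapsed += 1
        return self.current_phase


if __name__ == "__main__":
    hold = MinPhaseHold(min_steps=3)
    requested = [1, 1, 1, 0, 1, 0, 0, 0]  # raw actions proposed by the agent
    print([hold.apply(a) for a in requested])
```

With this wrapper, a switch requested before the current phase has lasted `min_steps` steps is ignored, so rapid back-and-forth requests from the agent do not reach the signal.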
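Table 1. Simulation experiment hyperparameter settings

| Hyperparameter | Value |
| --- | --- |
| Discount factor γ | 0.75 |
| Learning rate α | 0.001 |
| Batch size (batch_size) | 64 |
| Experience replay capacity (memory_size) | 50 000 |
| Training episodes (episode) | 200 |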
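To show where the discount factor from Table 1 enters the learning update, the sketch below computes a single bootstrapped target. It assumes that DDQN denotes Double DQN, which this page does not spell out, and the function name and example Q-values are hypothetical. It is not the paper's DDQN-WRTD implementation.

```python
GAMMA = 0.75  # discount factor γ from Table 1


def double_dqn_target(reward, next_q_online, next_q_target, done, gamma=GAMMA):
    """Double DQN target: the online network selects the next action,
    and the target network evaluates that action."""
    if done:
        # No bootstrapping from a terminal state.
        return reward
    best_action = max(range(len(next_q_online)), key=next_q_online.__getitem__)
    return reward + gamma * next_q_target[best_action]


if __name__ == "__main__":
    # Hypothetical Q-values for two discrete signal phases in the next state.
    q_online = [0.25, 1.0]   # online network prefers action 1
    q_target = [0.5, 0.25]   # target network's estimates
    print(double_dqn_target(1.0, q_online, q_target, done=False))
```

In this example the online network selects action 1, so the target uses the target network's value 0.25 for that action rather than its maximum 0.5. This decoupling of selection and evaluation is what Double DQN uses to reduce overestimation.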
Table 2. Statistics of mainline traffic parameters

| Method | Mainline throughput (veh/h) (change vs. no control, %) | Average travel time (s) (change vs. no control, %) |
| --- | --- | --- |
| No control | 1 713 | 222.195 |
| ALINEA | 2 834 (+65.44) | 94.062 (-57.67) |
| ML | 2 791 (+62.93) | 99.546 (-55.20) |
| MPC-RL | 3 122 (+82.25) | 70.458 (-68.29) |
| DDQN | 3 235 (+88.85) | 65.5457 (-70.50) |
| DDQN-WRTD | 3 053 (+78.23) | 75.831 (-65.87) |
Table 3. Statistics of ramp traffic parameters

| Method | Ramp throughput (veh/h) (change vs. no control, %) | Average travel time (s) (change vs. no control, %) |
| --- | --- | --- |
| No control | 438 | 263.003 |
| ALINEA | 230 (-47.49) | 375.413 (+42.74) |
| ML | 228 (-47.95) | 383.483 (+45.82) |
| MPC-RL | 207 (-52.74) | 403.673 (+53.49) |
| DDQN | 203 (-58.31) | 445.500 (+69.38) |
| DDQN-WRTD (this paper) | 231 (-47.26) | 367.128 (+39.59) |
Table 4. Statistics of overall traffic flow parameters

| Method | Overall throughput (veh/h) (change vs. no control, %) | Average travel time (s) (change vs. no control, %) |
| --- | --- | --- |
| No control | 2 151 | 230.505 |
| ALINEA | 3 064 (+42.45) | 115.181 (-50.03) |
| ML | 3 019 (+40.35) | 120.989 (-47.51) |
| MPC-RL | 3 329 (+54.77) | 82.403 (-64.25) |
| DDQN | 3 438 (+59.83) | 87.980 (-61.83) |
| DDQN-WRTD (this paper) | 3 284 (+52.67) | 96.321 (-58.21) |
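The percentages in parentheses are changes relative to the no-control row. As a quick check, the short sketch below recomputes the two headline figures for DDQN-WRTD from the Table 4 values. The helper name `relative_change` is introduced here for illustration only.

```python
def relative_change(value: float, baseline: float) -> float:
    """Percentage change of `value` relative to `baseline`."""
    return (value - baseline) / baseline * 100.0


if __name__ == "__main__":
    # Overall traffic flow values taken from Table 4.
    baseline_throughput, baseline_time = 2151, 230.505    # no control
    wrtd_throughput, wrtd_time = 3284, 96.321             # DDQN-WRTD

    print(f"throughput:  {relative_change(wrtd_throughput, baseline_throughput):+.2f}%")
    print(f"travel time: {relative_change(wrtd_time, baseline_time):+.2f}%")
```

The two printed values, +52.67% and -58.21%, match the throughput gain and travel time reduction quoted in the abstract.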