A Merging Model Based on Piecewise Deep Reinforcement Learning for Connected and Autonomous Vehicle in Work Zone under Mixed Autonomy
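Abstract (translated from the Chinese original): Classical early-merge and late-merge strategies adapt poorly to dynamic traffic demand, and upstream speed differences leave merging vehicles misaligned with the available gaps. To address these problems, a piecewise merging control model for connected and autonomous vehicles (CAVs) in work zones is studied using deep reinforcement learning. By performing speed guidance, gap creation, and position alignment in sequence, the model resolves the merging conflicts and efficiency loss that arise when several vehicles in the closed lane simultaneously request the same gap in the open lane during the lane-changing phase. The model combines longitudinal trajectory control based on the soft actor-critic algorithm with rule-based lane-changing decisions to jointly optimize the merging trajectory. For longitudinal trajectory optimization, nine features are first selected as the agent state to characterize the local and global traffic conditions of the ego vehicle: the speed and acceleration of the ego vehicle, the speed of and distance to the lead vehicle, the speeds of and distances to the lead and lag vehicles in the adjacent lane, and the distance to the merging point. Second, a piecewise reward function for the work zone is constructed from the perspectives of comfort, safety, and efficiency, with the objectives of reducing the magnitude of acceleration and its rate of change, avoiding collisions, creating merging gaps, aligning with the center of the gap in the open lane, suppressing speed differences with the lead and lag vehicles, following the advisory speed, and encouraging the lag vehicle to yield. In particular, an efficiency penalty function built on the speed difference with the lag vehicle in the target lane addresses the frequent stop delays at the merging point in mixed traffic flow. Simulation results show that, under medium and high demand, the proposed model improves the average speed and the minimum time-to-collision by about 4.76% and 19.71%, respectively, compared with early merge, late merge, and New England merge, further improving traffic efficiency and safety in the work zone. In addition, in mixed traffic with heterogeneous human-driven vehicles, the average speed, minimum time-to-collision, and merging success rate all increase as the CAV market penetration rate rises, and merging without stopping is achieved in all cases.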
Keywords:
- intelligent transportation /
- work zone merging /
- merging control model /
- soft actor-critic algorithm /
- mixed traffic flow
Abstract: Classical early-merge and late-merge strategies perform poorly under dynamic demand, and large upstream speed differences leave merging vehicles misaligned with the available gaps. To this end, a piecewise deep reinforcement learning-based merging model is proposed for connected and autonomous vehicles (CAVs) in work zones under mixed autonomy. The model applies speed guidance, gap creation, and positional alignment in sequence to address the merging conflicts and efficiency loss caused by several vehicles in the closed lane trying to merge into the same gap on the open lane. It consists of soft actor-critic (SAC)-based longitudinal control and rule-based lane-changing decision-making. For longitudinal control, nine features are selected as the agent state to describe surrounding traffic conditions from both local and global views: the speed and acceleration of the ego vehicle, the speed of and distance to the lead vehicle, the speeds of and distances to the lead and lag vehicles in the adjacent left lane, and the distance to the merging point. A piecewise reward function for CAVs in the work zone is then established to optimize comfort, safety, and efficiency simultaneously. It combines terms for minimizing acceleration and jerk, preventing collisions, generating merging gaps, aligning with the gap center on the open lane, mitigating speed differences between vehicles, adhering to the advisory speed, and encouraging following vehicles to yield. In particular, the efficiency term is shaped on the speed difference between the ego vehicle and the lag vehicle in the adjacent lane, which reduces stopping of both CAVs and human-driven vehicles at the merging point. Simulation results show that, compared with early merge, late merge, and New England merge, the proposed model increases the average speed by about 4.76% and the minimum time-to-collision by about 19.71% under medium and heavy demand in the work zone. In addition, in mixed autonomy with heterogeneous human-driven vehicles, the average speed, minimum time-to-collision, and merging success rate all increase with the CAV market penetration rate, and all vehicles merge without halting.
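The nine-feature agent state described in the abstract can be pictured as a flat vector. The following is a minimal sketch only: the class name `MergeState`, the field names, the feature ordering, and the sample values are illustrative assumptions, not the authors' implementation.

```python
from dataclasses import dataclass, astuple


@dataclass(frozen=True)
class MergeState:
    """Hypothetical container for the nine state features (all names are assumptions)."""
    ego_speed: float        # m/s, speed of the ego vehicle
    ego_accel: float        # m/s^2, acceleration of the ego vehicle
    dist_to_merge: float    # m, distance to the merging point
    lead_speed: float       # m/s, lead vehicle in the same lane
    lead_gap: float         # m, distance to the lead vehicle
    adj_lead_speed: float   # m/s, lead vehicle in the adjacent lane
    adj_lead_gap: float     # m, distance to the adjacent-lane lead vehicle
    adj_lag_speed: float    # m/s, lag vehicle in the adjacent lane
    adj_lag_gap: float      # m, distance to the adjacent-lane lag vehicle

    def to_vector(self) -> list[float]:
        """Flatten the features in declaration order for use as a policy input."""
        return list(astuple(self))


# Made-up sample values, for illustration only.
state = MergeState(25.0, 0.3, 600.0, 24.0, 35.0, 27.0, 40.0, 26.0, 30.0)
vec = state.to_vector()
```

Keeping the features in a named structure rather than a bare list makes the split between local features (ego and surrounding vehicles) and the global feature (distance to the merging point) explicit.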
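Table 1. State of agents

| State feature | Meaning |
| --- | --- |
| $v_{h,t}^{e}$ | Speed of the $h$-th vehicle (ego vehicle) at time $t$ |
| $a_{h,t}^{e}$ | Acceleration of the $h$-th vehicle (ego vehicle) at time $t$ |
| $p_{h,t}^{e}$ | Distance from the $h$-th vehicle to the merging point at time $t$ |
| $v_{h,t}^{f}$ | Speed of the lead vehicle of the $h$-th vehicle at time $t$ |
| $s_{h,t}^{f}$ | Gap between the $h$-th vehicle and its lead vehicle at time $t$ |
| $v_{h,t}^{lf}$ | Speed of the lead vehicle in the adjacent lane of the $h$-th vehicle at time $t$ |
| $s_{h,t}^{lf}$ | Gap between the $h$-th vehicle and the lead vehicle in the adjacent lane at time $t$ |
| $v_{h,t}^{lb}$ | Speed of the lag vehicle in the adjacent lane of the $h$-th vehicle at time $t$ |
| $s_{h,t}^{lb}$ | Gap between the $h$-th vehicle and the lag vehicle in the adjacent lane at time $t$ |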
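Table 2. Model parameter settings

| Parameter | Value | Parameter | Value |
| --- | --- | --- | --- |
| Discount factor $\gamma$ | 0.99 | Longitudinal collision penalty coefficient $\delta$ | 5 |
| Replay buffer capacity $N_{rb}$ | 648 000 | Lateral collision penalty coefficient $\kappa$ | 4 |
| Batch size $N_{b}$ | 128 | Position reward coefficient $\varepsilon$ | 0.3 |
| Number of hidden-layer neurons $n_{h}$ | 256 | Smoothness reward coefficient $\eta$ | 1.5 |
| Delayed update steps $\tau_{step}$ | 3 | Stability penalty coefficient $\lambda$ | 0.02 |
| Mean of policy network distribution $\mu_{p}$ | 0.001 | Desired speed inside the matching zone $v_{in}$ (m/s) | 25 |
| Standard deviation of policy network distribution $\sigma_{p}$ | 0.001 | Desired speed outside the matching zone $v_{out}$ (m/s) | 30 |
| Actor learning rate lr-actor | 0.000 3 | Maximum speed limit $v_{max}$ (m/s) | 33.33 |
| Critic learning rate lr-critic | 0.000 3 | Minimum speed limit $v_{min}$ (m/s) | 16.67 |
| Vehicle safety-degree parameter $\sigma_{v}$ | 0.5 | Linear adjustment parameter for the closed lane $\alpha$ | 0.125 |
| Standstill distance $s_{CC0}$ (m) | 1.5 | Linear adjustment parameter for the open lane $\beta$ | 0.25 |
| Car-following oscillation distance $s_{CC2}$ (m) | 4 | Efficiency lane-change threshold $i$ (m) | 30 |
| Safe time headway $s_{CC1c}$ (s) | 1.7 | Start of gap creation zone $p_{ges}$ (m) | 450 |
| Double safe time headway $s_{CC1o}$ (s) | 3.8 | Start of position alignment zone $p_{pae}$ (m) | 850 |
| Comfort reward weight $\omega_{a}$ | 0.1 | End of position alignment zone $p_{pas}$ (m) | 1 650 |
| Safety reward weight $\omega_{s}$ | 0.5 | End of merging zone $p_{me}$ (m) | 1 850 |
| Efficiency reward weight $\omega_{e}$ | 0.4 | Vehicle length $L_{2}$ (m) | 5 |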
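The zone boundaries and reward weights in Table 2 suggest how a piecewise scheme might be organized. The sketch below is illustrative and rests on several assumptions that this section does not confirm: that position is measured in meters and increases in the downstream direction, that the speed guidance phase covers everything upstream of the gap creation zone, and that the three reward components are combined as a linear weighted sum. The function names and zone labels are hypothetical.

```python
# Numeric values are copied from Table 2; everything else is an illustrative assumption.
W_COMFORT, W_SAFETY, W_EFFICIENCY = 0.1, 0.5, 0.4
P_GAP_START = 450.0     # start of gap creation zone (m)
P_ALIGN_START = 850.0   # start of position alignment zone (m)
P_ALIGN_END = 1650.0    # end of position alignment zone (m)
P_MERGE_END = 1850.0    # end of merging zone (m)


def zone_of(position_m: float) -> str:
    """Map a longitudinal position (assumed to increase downstream) to a control phase."""
    if position_m < P_GAP_START:
        return "speed_guidance"
    if position_m < P_ALIGN_START:
        return "gap_creation"
    if position_m < P_ALIGN_END:
        return "position_alignment"
    if position_m <= P_MERGE_END:
        return "merging"
    return "downstream"


def total_reward(r_comfort: float, r_safety: float, r_efficiency: float) -> float:
    """Combine the three components as a weighted sum (assumed linear combination)."""
    return W_COMFORT * r_comfort + W_SAFETY * r_safety + W_EFFICIENCY * r_efficiency
```

Under these assumptions, each zone would select which component terms are active, while the weights, which sum to 1, set the trade-off among comfort, safety, and efficiency.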
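Table 3. Simulation results for mixed traffic flow with heterogeneous HDVs

| Flow (pcu/h) | CAV penetration rate | Average speed (m/s) | Minimum TTC (s) | Merging success rate (%) |
| --- | --- | --- | --- | --- |
| 1 000 | 0.2 | 26.15 | 0.71 | 97.20 |
| 1 000 | 0.4 | 26.31 | 1.32 | 97.70 |
| 1 000 | 0.6 | 26.65 | 2.26 | 98.80 |
| 1 000 | 0.8 | 27.38 | 4.03 | 99.20 |
| 1 500 | 0.2 | 23.79 | 0.61 | 95.47 |
| 1 500 | 0.4 | 24.82 | 0.80 | 96.07 |
| 1 500 | 0.6 | 25.64 | 1.00 | 97.93 |
| 1 500 | 0.8 | 26.95 | 1.41 | 98.27 |
| 2 000 | 0.2 | 22.89 | 0.51 | 91.65 |
| 2 000 | 0.4 | 25.22 | 0.53 | 93.70 |
| 2 000 | 0.6 | 25.78 | 0.61 | 95.30 |
| 2 000 | 0.8 | 26.76 | 0.74 | 97.25 |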
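Table 4. Comparison of merging effects

| Flow (pcu/h) | Merging strategy | Merging success rate (%) | Average speed (m/s) | Minimum TTC (s) |
| --- | --- | --- | --- | --- |
| 1 000 | EM | 100 | 28.61 | 6.43 |
| 1 000 | LM | 100 | 29.03 | 4.01 |
| 1 000 | NEM | 100 | 29.13 | 5.87 |
| 1 000 | RLM | 100 | 28.69 | 6.74 |
| 1 500 | EM | 98.89 | 27.55 | 1.73 |
| 1 500 | LM | 98.51 | 27.94 | 1.66 |
| 1 500 | NEM | 99.75 | 28.51 | 1.98 |
| 1 500 | RLM | 100 | 28.58 | 2.11 |
| 2 000 | EM | 89.39 | 25.12 | 0.76 |
| 2 000 | LM | 92.39 | 24.39 | 0.88 |
| 2 000 | NEM | 98.05 | 27.58 | 0.95 |
| 2 000 | RLM | 99.35 | 28.23 | 1.07 |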
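The relative gains of RLM over each baseline can be read directly off Table 4. The following is a minimal sketch of that arithmetic. The data are copied from Table 4, while the helper name `relative_gain` and the index constants are hypothetical. This section does not state how the abstract's aggregate figures (about 4.76% and 19.71%) were averaged, so the snippet reports only per-flow, per-baseline gains and does not attempt to reproduce them.

```python
# Rows copied from Table 4:
# flow (pcu/h) -> strategy -> (success rate %, average speed m/s, minimum TTC s)
TABLE_4 = {
    1000: {"EM": (100.00, 28.61, 6.43), "LM": (100.00, 29.03, 4.01),
           "NEM": (100.00, 29.13, 5.87), "RLM": (100.00, 28.69, 6.74)},
    1500: {"EM": (98.89, 27.55, 1.73), "LM": (98.51, 27.94, 1.66),
           "NEM": (99.75, 28.51, 1.98), "RLM": (100.00, 28.58, 2.11)},
    2000: {"EM": (89.39, 25.12, 0.76), "LM": (92.39, 24.39, 0.88),
           "NEM": (98.05, 27.58, 0.95), "RLM": (99.35, 28.23, 1.07)},
}
SPEED, TTC = 1, 2  # tuple indices (hypothetical helper constants)


def relative_gain(flow: int, baseline: str, metric: int) -> float:
    """Percentage change of RLM relative to a baseline strategy for one metric."""
    base = TABLE_4[flow][baseline][metric]
    rlm = TABLE_4[flow]["RLM"][metric]
    return 100.0 * (rlm - base) / base


if __name__ == "__main__":
    for flow in (1500, 2000):
        for baseline in ("EM", "LM", "NEM"):
            speed_gain = relative_gain(flow, baseline, SPEED)
            ttc_gain = relative_gain(flow, baseline, TTC)
            print(f"{flow} pcu/h vs {baseline}: speed {speed_gain:+.2f}%, TTC {ttc_gain:+.2f}%")
```

At 2 000 pcu/h, for example, the RLM average speed is roughly 12.4% higher than that of EM, whereas at 1 000 pcu/h RLM is slightly slower than LM and NEM, consistent with the abstract restricting its claim to medium and heavy demand.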
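Table 5. Model performance under different reinforcement learning methods

| Flow (pcu/h) | Model | Average speed (m/s) | Minimum TTC (s) | Average number of stops (stops per lane) |
| --- | --- | --- | --- | --- |
| 1 000 | DDPG-M | 27.23 | 6.89 | 0.05 |
| 1 000 | TD3-M | 27.69 | 6.07 | 0.01 |
| 1 000 | SAC-NEM | 29.13 | 5.87 | 0.02 |
| 1 000 | RLM | 28.69 | 6.74 | 0.00 |
| 1 500 | DDPG-M | 27.21 | 2.17 | 0.07 |
| 1 500 | TD3-M | 27.63 | 2.01 | 0.03 |
| 1 500 | SAC-NEM | 28.51 | 1.98 | 0.09 |
| 1 500 | RLM | 28.58 | 2.11 | 0.01 |
| 2 000 | DDPG-M | 26.93 | 0.75 | 0.09 |
| 2 000 | TD3-M | 27.58 | 0.72 | 0.09 |
| 2 000 | SAC-NEM | 27.58 | 0.95 | 0.31 |
| 2 000 | RLM | 28.23 | 1.07 | 0.03 |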
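Table 6. Results of comfort simulation

| Flow (pcu/h) | Strategy | Total weighted RMS acceleration (m/s²) |
| --- | --- | --- |
| 1 500 | EM | 1.14 |
| 1 500 | LM | 1.13 |
| 1 500 | RLM | 0.73 |
| 2 000 | EM | 1.26 |
| 2 000 | LM | 1.32 |
| 2 000 | RLM | 0.781 |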
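Table 6 reports comfort as a total weighted root-mean-square (RMS) acceleration. The weighting scheme is not given in this section, so the sketch below computes only a plain, unweighted RMS of an acceleration trace, as a simplified stand-in for the metric. The function name and the sample trace are hypothetical.

```python
import math
from typing import Sequence


def rms(samples: Sequence[float]) -> float:
    """Unweighted root-mean-square of a sequence of acceleration samples (m/s^2)."""
    if not samples:
        raise ValueError("at least one sample is required")
    return math.sqrt(sum(a * a for a in samples) / len(samples))


# Hypothetical acceleration trace, for illustration only (not simulation data).
accel_trace = [0.5, -0.5, 1.0, -1.0]
comfort_metric = rms(accel_trace)
```

Because the samples are squared before averaging, acceleration and braking count equally, and a lower value indicates a smoother ride. This matches the direction of the comparison in Table 6, where RLM has the smallest value at both flow levels.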