A Method for Timely Detecting Foreign Objects between Metro Platform Screen Doors and Train Doors Based on an Improved YOLOv5s Model

DAI Yuan¹, LIU Weiming¹, WANG Heng², XIE Wei¹, LONG Kejun³

1. School of Civil Engineering and Transportation, South China University of Technology, Guangzhou 510641, China
2. Shenzhen Metro Group Co., Ltd., Shenzhen 518026, Guangdong, China
3. School of Traffic and Transportation Engineering, Changsha University of Science and Technology, Changsha 410114, China

doi: 10.3963/j.jssn.1674-4861.2023.02.002

Funding: National Natural Science Foundation of China (52172313)

Author biography: DAI Yuan (b. 1995), doctoral candidate. Research interest: traffic information engineering and control. E-mail: ctdaiyuan@mail.scut.edu.cn

Corresponding author: LIU Weiming (b. 1963), Ph.D., professor. Research interests: traffic information engineering and control, intelligent transportation. E-mail: mingweiliu@126.com

CLC number: U231+.92
Received: 2022-08-29
Published online: 2023-06-19

Abstract: Fast and accurate detection of foreign objects between platform screen doors (PSDs) and train doors at metro stations is essential for operational safety. To address the inefficiency and inaccuracy of current detection methods, a fast detection method based on the you-only-look-once (YOLO) v5s model is proposed. Because the original YOLOv5s model relies only on features inside candidate regions and ignores global contextual information, a global context module is introduced to address this limitation. The module integrates a non-local module and a squeeze-excitation module: the non-local module uses a self-attention mechanism to model pairwise relationships between pixels and capture long-range dependencies, while the squeeze-excitation module serves to reduce the computational cost of the model. The global context module enables the model to capture global contextual information and combine it with local information for better foreign object detection, without significantly increasing computational complexity. In addition, the inefficient Focus module of the original YOLOv5s is replaced with a Stem module built entirely from standard convolutional units, which reduces computation and increases detection speed. The model is validated on a dataset of 5 854 foreign object images collected from real metro stations, using a desktop-class NVIDIA TITAN Xp graphics card. The results show that: ① the improved YOLO model clearly outperforms the baseline models, reaching a detection speed of 385 frames per second, a 100% improvement over the original YOLOv5s and a 466% improvement over YOLOv3-SPP, the fastest of the other baselines; ② the improved YOLO model achieves an average detection accuracy (mAP@0.5) of 88.5%, which is 0.5 percentage points higher than the original YOLOv5s and 0.6 percentage points higher than YOLOv3-SPP, the most accurate of the other baselines; ③ the improved YOLO model occupies only 14.4 MB of storage, 0.7% less than the original YOLOv5s and 85% less than the single shot multibox detector (SSD), the smallest of the other baselines.

Keywords: rail transit; smart metro; foreign object detection; YOLO model; attention mechanism
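
The global context module described in the abstract can be illustrated with a small sketch. The pure-Python example below follows the general GCNet-style recipe: attention pooling over all positions, a bottleneck transform, then adding the resulting vector to every position. The function names, the toy weights, and the omission of layer normalization are simplifying assumptions made here for illustration; this is not the authors' implementation.

```python
import math

def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def global_context_block(features, w_key, w_down, w_up):
    """Apply a simplified global context block.

    features: list of N positions, each a list of C channel values
              (a flattened H*W feature map).
    w_key:    C weights giving one attention logit per position
              (stands in for a 1x1 convolution).
    w_down:   R x C bottleneck weights (R < C).
    w_up:     C x R weights restoring the channel count.
    """
    channels = len(features[0])
    # 1. Context modeling: one attention weight per position.
    logits = [sum(w * x for w, x in zip(w_key, pos)) for pos in features]
    attention = softmax(logits)
    context = [sum(a * pos[c] for a, pos in zip(attention, features))
               for c in range(channels)]
    # 2. Transform: bottleneck with ReLU, in the spirit of squeeze-excitation.
    hidden = [max(0.0, sum(w * v for w, v in zip(row, context))) for row in w_down]
    delta = [sum(w * h for w, h in zip(row, hidden)) for row in w_up]
    # 3. Fusion: add the same global vector to every position.
    return [[x + d for x, d in zip(pos, delta)] for pos in features]

# Hypothetical toy input: 4 positions, 2 channels.
features = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0], [0.5, 0.5]]
w_key = [0.0, 0.0]       # zero logits give uniform attention
w_down = [[1.0, 1.0]]    # bottleneck of width 1
w_up = [[0.5], [-0.5]]
output = global_context_block(features, w_key, w_down, w_up)
print(output[0])
```

In this sketch the context vector is computed once and shared by all positions, so the extra work grows only linearly with the number of positions. That is one plausible reading of why such a module can add global information at modest cost.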

Figure 1. Examples of combining global contextual information and local information to solve foreign object detection

Figure 2. Slicing operation of the Focus module
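
As a rough illustration of the slicing operation named in Figure 2, the sketch below rearranges a single-channel image into four half-resolution channels by taking every second pixel at the four possible row and column offsets, which is how the Focus layer in the public YOLOv5 code is usually described. The function name and the toy 4 x 4 input are assumptions for illustration; the real layer works on batched tensors and follows the slices with a convolution.

```python
def focus_slice(image):
    """Split an H x W grid (H and W even) into four (H/2) x (W/2) grids.

    Slice order mirrors the YOLOv5 Focus layer: (row, col) offsets
    (0, 0), (1, 0), (0, 1), (1, 1).
    """
    offsets = [(0, 0), (1, 0), (0, 1), (1, 1)]
    return [[row[dc::2] for row in image[dr::2]] for dr, dc in offsets]

# Hypothetical 4 x 4 single-channel image with values 0..15.
image = [[r * 4 + c for c in range(4)] for r in range(4)]
slices = focus_slice(image)
for channel in slices:
    print(channel)
```

Applied to a 3-channel input, the same rearrangement yields 12 channels at half the spatial resolution, so no pixel values are discarded.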

Figure 3. Network architecture of the improved YOLOv5s model

Figure 4. Architectures of various blocks

Figure 5. The architecture of Stem block

Figure 6. Data acquisition schematic and sample images

Figure 7. An example of detection results of our method on the constructed dataset

Table 1. Detailed statistics of the dataset used in this paper

Category	Training set	Validation set	Test set	Total
Thick rope	389	83	123	595
Thin rope	325	88	96	509
Hair	14	5	5	24
Backpack	57	13	19	89
Plastic bag	396	111	121	628
Box	58	14	15	87
Shoulder bag	230	62	66	358
Wallet	440	109	141	690
Mobile phone	397	94	134	625
Bottle	575	147	206	928
Umbrella	59	20	15	94
Person	65	22	10	97
Other	45	8	12	65
Normal	421	89	124	634
Cardboard	330	82	100	512
Total	3 801	947	1 187	5 935
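
The totals row of Table 1 implies how the data are divided among the three splits. The short check below uses only those printed totals; the variable names are chosen here for illustration.

```python
# Totals row of Table 1: training, validation, and test counts.
split_totals = {"train": 3801, "val": 947, "test": 1187}

overall = sum(split_totals.values())
shares = {name: round(100 * count / overall, 1)
          for name, count in split_totals.items()}
print(overall, shares)
```

Under these totals the split is close to 64% training, 16% validation, and 20% test.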

Table 2. Experimental platform configuration

Item	Specification
Operating system	Ubuntu 18.04
CPU	Intel(R) Xeon(R) CPU E5-2680 v4 @ 2.40GHz
GPU	NVIDIA TITAN Xp
Memory	32 GB
Deep learning framework	PyTorch

Table 3. Comparison of the proposed model with other detection models on the constructed dataset

Model	Input size/pixels	mAP@0.5/%	FPS	Model size/MB
SSD	300×300	85.8	45	97.7
YOLOv3-SPP	640×640	87.9	68	119.0
YOLOv4	640×480	87.6	30	245.0
YOLOv5s	640×640	88.0	192	14.5
YOLOX-L	640×640	86.8	34	364.0
PP-YOLOv1	608×608	84.3	15	178.0
PP-YOLOv2	640×640	85.5	12	279.0
Proposed method	640×480	88.5	385	14.4
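
The percentage improvements quoted in the abstract can be reproduced from Table 3. The sketch below copies four rows of the table and computes the relative changes; the helper name `relative_change` and the dictionary layout are illustrative choices, not part of the paper.

```python
def relative_change(new, old):
    """Percentage change of `new` relative to `old`."""
    return 100.0 * (new - old) / old

# Values copied from Table 3: (mAP@0.5 in %, FPS, model size in MB).
table3 = {
    "SSD": (85.8, 45, 97.7),
    "YOLOv3-SPP": (87.9, 68, 119.0),
    "YOLOv5s": (88.0, 192, 14.5),
    "Proposed": (88.5, 385, 14.4),
}
ours = table3["Proposed"]

speedup_v5s = relative_change(ours[1], table3["YOLOv5s"][1])
speedup_v3 = relative_change(ours[1], table3["YOLOv3-SPP"][1])
size_v5s = relative_change(ours[2], table3["YOLOv5s"][2])
size_ssd = relative_change(ours[2], table3["SSD"][2])
map_gain_v5s = ours[0] - table3["YOLOv5s"][0]   # percentage points
map_gain_v3 = ours[0] - table3["YOLOv3-SPP"][0]  # percentage points

print(f"speed: {speedup_v5s:+.1f}% vs YOLOv5s, {speedup_v3:+.1f}% vs YOLOv3-SPP")
print(f"size:  {size_v5s:+.1f}% vs YOLOv5s, {size_ssd:+.1f}% vs SSD")
print(f"mAP:   {map_gain_v5s:+.1f} and {map_gain_v3:+.1f} points")
```

Speed and model size are compared as relative changes, whereas the mAP gains are differences in percentage points.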

Table 4. Impact of each module on model performance

Model	mAP@0.5/%	Parameters	Computation/GFLOPs	GPU detection latency/ms	CPU detection latency/ms	Model size/MB	Training time/h
YOLOv5s	88.0	7 091 668	16.4	5.2	404.4	14.5	9.476
+gc_block	88.7	7 023 547	17.8	6.1	515.9	14.4	11.261
+stem_block	87.8	7 096 324	4.6	2.3	238.2	14.4	6.700
Proposed method (+gc+stem)	88.5	7 028 203	4.9	2.6	231.1	14.4	8.265
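
The GPU latencies in Table 4 can be related to the frame rates in Table 3 by taking the reciprocal. The sketch below assumes one image per inference (batch size 1), which the source does not state explicitly; the helper name is illustrative.

```python
def fps_from_latency(latency_ms):
    """Frames per second implied by a per-image latency in milliseconds."""
    return 1000.0 / latency_ms

# GPU latency (ms) and computation (GFLOPs) copied from Table 4.
baseline = {"latency_ms": 5.2, "gflops": 16.4}   # YOLOv5s
proposed = {"latency_ms": 2.6, "gflops": 4.9}    # +gc+stem

baseline_fps = fps_from_latency(baseline["latency_ms"])
proposed_fps = fps_from_latency(proposed["latency_ms"])
gflops_saving = 100.0 * (1 - proposed["gflops"] / baseline["gflops"])

print(round(baseline_fps), round(proposed_fps), round(gflops_saving))
```

Under that assumption the latencies correspond to about 192 and 385 frames per second, matching Table 3, and the computation falls by roughly 70%.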

Table 5. Precision, recall, F1 score, and mAP of the proposed model

Category	Precision	Recall	F1 score	mAP@0.5	mAP@0.5:0.95
Thick rope	0.880	0.715	0.789	0.840	0.471
Thin rope	0.786	0.635	0.702	0.775	0.368
Hair	0.791	0.800	0.795	0.855	0.389
Backpack	0.910	0.737	0.814	0.840	0.502
Plastic bag	0.979	0.983	0.981	0.982	0.653
Box	0.974	1.000	0.987	0.995	0.838
Shoulder bag	0.964	0.985	0.974	0.985	0.673
Wallet	0.859	0.872	0.865	0.895	0.426
Mobile phone	0.961	0.940	0.950	0.981	0.551
Bottle	0.993	0.995	0.994	0.994	0.622
Umbrella	0.867	1.000	0.929	0.946	0.609
Person	0.956	1.000	0.978	0.995	0.867
Other	0.599	0.667	0.631	0.479	0.225
Normal	0.736	0.718	0.727	0.741	0.365
Cardboard	0.960	0.951	0.955	0.975	0.560
Average	0.881	0.867	0.874	0.885	0.541
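
The F1 column of Table 5 is the harmonic mean of precision and recall. As a quick check, the sketch below recomputes it for three rows copied from the table (thick rope, box, and the average row); the function and variable names are illustrative.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0 when both are 0)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# (precision, recall) pairs copied from Table 5.
rows = {
    "thick rope": (0.880, 0.715),
    "box": (0.974, 1.000),
    "average": (0.881, 0.867),
}
f1 = {name: round(f1_score(p, r), 3) for name, (p, r) in rows.items()}
print(f1)
```

The recomputed values, 0.789, 0.987, and 0.874, agree with the F1 column of the table.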

