
A Novel Ship Driver Behavior Recognition Approach Based on Improved TSM

CHEN Chen, WEI Yuenan, MA Feng, HU Songtao, WANG Tengfei

Citation: CHEN Chen, WEI Yuenan, MA Feng, HU Songtao, WANG Tengfei. A Novel Ship Driver Behavior Recognition Approach Based on Improved TSM[J]. Journal of Transport Information and Safety, 2025, 43(1): 120-129. doi: 10.3963/j.jssn.1674-4861.2025.01.011


doi: 10.3963/j.jssn.1674-4861.2025.01.011
Funding:

National Natural Science Foundation of China (52201415)

National Natural Science Foundation of China (52171352)

National Key R&D Program of China (2023YFB4302300)

Open Project of the National Key Laboratory of Waterway Traffic Control (16-10-1)

Article information
    About the authors:

    CHEN Chen (1985—), PhD, Lecturer. Research interests: artificial intelligence. E-mail: chenchen0120@wit.edu.cn

    Corresponding author:

    MA Feng (1985—), PhD, Researcher. Research interests: intelligent ships. E-mail: martin7wind@whut.edu.cn

  • CLC number: U676.1


  • Abstract: Non-standard operations by ship drivers are a major cause of waterborne traffic accidents, so designing a real-time ship driver behavior detection method is of great significance. Compared with scenarios such as car driving and security surveillance, the environment of a ship's bridge is more complex, and existing approaches cannot attend to several crew members at once and suffer from low efficiency and limited accuracy. To address this, a "two-step" multi-person behavior recognition method combining multi-object tracking with action recognition is studied. YOLOv7 and ByteTrack are used to build a multi-object tracker that produces continuous per-person feature maps. Building on the temporal shift module (TSM), a single-target action recognition algorithm, the continuous feature maps are processed by super-resolution up-sampling and cross-frame concatenation, and an EfficientNet-B3 backbone with a coordinate attention (CA) module is used to output highly accurate recognition results. A ship-bridge behavior dataset, "SC-Action", is built from surveillance videos of different ship bridges and contains 2 000 behavior samples covering both routine and non-compliant behaviors. Transfer learning and ablation experiments on this dataset show that the proposed method achieves real-time behavior recognition of three drivers at 24 frames/s, outperforming mainstream algorithms in both recognition speed and accuracy. In single-person recognition tests, applying the image-enhancement module improves accuracy by 1.3% over the baseline TSM model; adding the attention mechanism further improves accuracy by 1.78%, to 82.1%, while the computational cost increases by only 0.1%. In multi-target tests, the actual inference speed and performance also surpass mainstream methods in this field such as SlowFast, verifying the effectiveness of the approach.
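For readers unfamiliar with TSM, the core temporal shift operation can be summarized in a few lines. The following is a minimal PyTorch sketch of the channel-shift idea from the original TSM paper, not the exact implementation used in this work; the (N·T, C, H, W) tensor layout and the 1/8 shift fraction are illustrative assumptions.

```python
import torch

def temporal_shift(x: torch.Tensor, n_segment: int, fold_div: int = 8) -> torch.Tensor:
    """Shift a fraction of channels across neighboring frames (TSM-style).

    x holds per-frame feature maps with shape (N*T, C, H, W), where T = n_segment.
    """
    nt, c, h, w = x.size()
    n_batch = nt // n_segment
    x = x.view(n_batch, n_segment, c, h, w)

    fold = c // fold_div                                   # channels shifted in each direction
    out = torch.zeros_like(x)
    out[:, :-1, :fold] = x[:, 1:, :fold]                   # first fold: pull features from the next frame
    out[:, 1:, fold:2 * fold] = x[:, :-1, fold:2 * fold]   # second fold: pull features from the previous frame
    out[:, :, 2 * fold:] = x[:, :, 2 * fold:]              # remaining channels stay in place
    return out.view(nt, c, h, w)

# Example: an 8-frame clip with 64-channel feature maps
feats = torch.randn(8, 64, 56, 56)
shifted = temporal_shift(feats, n_segment=8)
```

Because the shift only moves existing activations between frames, it adds temporal modeling to a 2D backbone such as EfficientNet-B3 at essentially no extra computation, which is consistent with the low GMAC figures reported later in Table 2.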

     

  • Figure 1. Structure of the multi-target behavior recognition approach

    Figure 2. Comparison of super-resolution enhancement effects of different algorithms

    Figure 3. Performance of different backbone networks on the "SC-Action" dataset

    Figure 4. Structure of the coordinate attention (CA) module

    Figure 5. Sample examples from the "SC-Action" dataset

    Figure 6. Convergence of training losses for selected networks

    Figure 7. Comparison of training-loss convergence with and without the customized preprocessing method

    Figure 8. Comparison of shallow feature maps of different networks

    Figure 9. Comparison of video inference results

    Table 1. Structure of the backbone network

    Stage  Operator        Resolution  Channels  Layers
    1      Conv3x3         448×448     32        1
    2      MBConv1, k3x3   224×224     16        2
    3      MBConv6, k3x3   224×224     24        3
    4      MBConv6, k5x5   112×112     40        3
    5      MBConv6, k3x3   56×56       80        5
    6      MBConv6, k5x5   28×28       112       5
    7      MBConv6, k5x5   28×28       192       6
    8      MBConv6, k3x3   14×14       320       2
    9      Conv1x1         14×14       1 280     1
    10     CA_Block        14×14       1 280     1
    11     Pooling&FC      14×14       7         1
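Stage 10 of the table inserts a coordinate attention (CA) block after the final 1x1 convolution (cf. Figure 4). As a reference for how such a block is typically wired, here is a minimal PyTorch sketch following the published CA design of pooling along height and width separately and re-weighting the feature map; the reduction ratio and the ReLU activation are illustrative assumptions rather than the exact settings of this paper.

```python
import torch
import torch.nn as nn

class CoordinateAttention(nn.Module):
    """Coordinate attention: direction-aware channel re-weighting of a (B, C, H, W) map."""

    def __init__(self, channels: int, reduction: int = 32):
        super().__init__()
        mid = max(8, channels // reduction)
        self.pool_h = nn.AdaptiveAvgPool2d((None, 1))   # pool over width  -> (B, C, H, 1)
        self.pool_w = nn.AdaptiveAvgPool2d((1, None))   # pool over height -> (B, C, 1, W)
        self.conv1 = nn.Conv2d(channels, mid, kernel_size=1)
        self.bn1 = nn.BatchNorm2d(mid)
        self.act = nn.ReLU(inplace=True)
        self.conv_h = nn.Conv2d(mid, channels, kernel_size=1)
        self.conv_w = nn.Conv2d(mid, channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.size()
        x_h = self.pool_h(x)                          # (B, C, H, 1)
        x_w = self.pool_w(x).permute(0, 1, 3, 2)      # (B, C, W, 1)
        y = torch.cat([x_h, x_w], dim=2)              # joint encoding along both axes
        y = self.act(self.bn1(self.conv1(y)))
        y_h, y_w = torch.split(y, [h, w], dim=2)
        a_h = torch.sigmoid(self.conv_h(y_h))                       # (B, C, H, 1)
        a_w = torch.sigmoid(self.conv_w(y_w.permute(0, 1, 3, 2)))   # (B, C, 1, W)
        return x * a_h * a_w                          # re-weight the original features

# Example: attach after a 1 280-channel feature map, as in stage 10 above
ca = CoordinateAttention(1280)
out = ca(torch.randn(2, 1280, 14, 14))
```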

    Table 2. Comparison of computational load, parameter count, and recognition accuracy of different models

    Model     Backbone                  Computation/GMAC  Parameters/M  Top-1 accuracy/%
    TSM       ResNet50                  132.17            23.52         77.79
    TSM       ResNet101                 251.62            42.51         81.4
    SlowFast  ResNet50                  101.16            33.66         75.1
    SlowFast  ResNet101                 163.88            52.87         76.66
    Proposed  Improved EfficientNet_B3  32.28             10.93         82.1

    Table 3. Ablation experiment results

    Method                 M1      M2      M3
    TSM-EfficientNet-B3    ✓       ✓       ✓
    + image enhancement            ✓       ✓
    + CA attention                         ✓
    Computation/GMAC       32.23   32.23   32.26
    Accuracy/%             79.03   80.32   82.1

    Table 4. Comparison of video inference frame rates

    Method                   Frame interval  Average frame rate/(frames/s)
    SlowFast-ResNet50        0               10
    ByteTrack+TSM-ResNet50   0               13
    Proposed method          0               15
    Proposed method          5               19
    Proposed method          10              24
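The frame-interval column reflects a simple scheduling idea: the tracker runs on every frame, while the heavier action recognizer is re-run only every N frames and its last prediction is reused in between, which is what pushes the proposed pipeline to 24 frames/s at an interval of 10. The sketch below illustrates that loop under stated assumptions; track_frame and classify_clips are hypothetical placeholders standing in for the YOLOv7+ByteTrack stage and the TSM classifier, not APIs of any specific library.

```python
from collections import deque

CLIP_LEN = 8          # frames per person fed to the recognizer (illustrative)
FRAME_INTERVAL = 10   # re-run recognition every N frames (cf. Table 4)

def run_pipeline(frames, track_frame, classify_clips):
    """Per-frame tracking with action recognition every FRAME_INTERVAL frames.

    track_frame(frame)    -> {track_id: person_crop}    (hypothetical helper)
    classify_clips(clips) -> {track_id: action_label}   (hypothetical helper)
    """
    buffers = {}       # track_id -> deque of the most recent crops
    last_labels = {}   # track_id -> most recent action label
    results = []

    for i, frame in enumerate(frames):
        crops = track_frame(frame)                       # tracking runs on every frame
        for tid, crop in crops.items():
            buffers.setdefault(tid, deque(maxlen=CLIP_LEN)).append(crop)

        if i % FRAME_INTERVAL == 0:                      # recognition runs only periodically
            ready = {tid: list(buf) for tid, buf in buffers.items() if len(buf) == CLIP_LEN}
            if ready:
                last_labels.update(classify_clips(ready))

        # between recognition frames, reuse the last known label for each visible person
        results.append({tid: last_labels.get(tid) for tid in crops})
    return results
```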
Publication history
  • Received: 2024-07-24
  • Published online: 2025-06-27
