Volume 42 Issue 6
Dec. 2024
LIAO Huimin, LUO Jingming, ZHANG Jinghui, LIU Wenping, DONG Wanqing, XIAO Hui, HUANG Jian. A Recognition Model for Passenger Boarding and Alighting Action Based on Improved Temporal Pyramid Network[J]. Journal of Transport Information and Safety, 2024, 42(6): 95-102. doi: 10.3963/j.jssn.1674-4861.2024.06.010

A Recognition Model for Passenger Boarding and Alighting Action Based on Improved Temporal Pyramid Network

doi: 10.3963/j.jssn.1674-4861.2024.06.010
  • Received Date: 2023-12-24
    Available Online: 2025-03-08
  • Traditional image-processing algorithms for identifying illegal passenger-carrying behavior rely on manually crafted human-vehicle interaction rules to recognize boarding and alighting actions. Such rule sets struggle with the complexity of real traffic scenes, and recognition performance suffers. A deep learning model based on the temporal pyramid network (TPN) is therefore introduced for boarding and alighting action recognition. Training on a large dataset allows more complete features of taxi passenger boarding and alighting behavior to be extracted, which improves recognition accuracy. Because the TPN model does not distinguish between driver and passenger roles, the output layer is redesigned based on door area perception, which improves the efficiency of multi-dimensional feature extraction. Because boarding and alighting actions span a large spatiotemporal range and are therefore prone to interference from irrelevant movements, a sliding window mechanism based on dynamic window weights is introduced to capture the key video frames of each action, improving recognition efficiency. Combining these improvements, a boarding and alighting neural network (BANN) model based on door area perception and dynamic weights is proposed to recognize illegal passenger-carrying behavior efficiently and accurately. For validation, a training dataset of 4,047 annotated video clips and a test dataset of 810 unannotated video clips are constructed from surveillance videos of Beijing Capital Airport. Experimental results show that the BANN model achieves a precision of 90.21% and a recall of 88.53%, improvements of 9.78% and 11.04%, respectively, over the baseline TPN model. These results indicate that the BANN model can meet the needs of traffic order supervision at transportation hubs.
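
The abstract does not specify how the dynamic window weights are computed, so the following is only a minimal illustrative sketch of the general idea of weighting sliding windows to locate key frames, not the authors' BANN implementation. It assumes a hypothetical per-frame relevance score (for example, motion intensity inside the door area) and turns the mean score of each window into a softmax weight; the function names, the softmax weighting, and the sample scores are all assumptions made for illustration.

```python
import math
from typing import List, Tuple


def window_weights(scores: List[float], window: int,
                   stride: int = 1) -> List[Tuple[int, float]]:
    """Return (start_index, weight) per window; the weights sum to 1.

    Assumption: `scores` are hypothetical per-frame relevance values,
    not a quantity defined in the paper.
    """
    if window <= 0 or stride <= 0:
        raise ValueError("window and stride must be positive")
    if len(scores) < window:
        raise ValueError("need at least `window` frame scores")
    starts = list(range(0, len(scores) - window + 1, stride))
    means = [sum(scores[s:s + window]) / window for s in starts]
    # Softmax over window means: more active windows get larger weights.
    # Subtracting the peak keeps exp() numerically stable.
    peak = max(means)
    exps = [math.exp(m - peak) for m in means]
    total = sum(exps)
    return [(s, e / total) for s, e in zip(starts, exps)]


def key_window(scores: List[float], window: int,
               stride: int = 1) -> Tuple[int, int]:
    """Return (start, end) of the highest-weight window, end exclusive."""
    start, _ = max(window_weights(scores, window, stride),
                   key=lambda pair: pair[1])
    return start, start + window


# Hypothetical scores for an 8-frame clip; activity peaks at frames 3-5.
frame_scores = [0.1, 0.1, 0.2, 0.9, 1.0, 0.8, 0.2, 0.1]
start, end = key_window(frame_scores, window=3)
print(start, end)  # prints "3 6"
```

In this sketch the frames in the selected window would be the ones passed on to the recognition network, while low-weight windows dominated by irrelevant movement would be ignored.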

     
