A Knowledge Graph of Ship Collision Prevention and Control Based on Multi-source Heterogeneous Information
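Abstract (translated from the Chinese original): Traditional research on water traffic accidents mainly mines accident cases for causal factors and for the relationships between accidents, and falls short in reflecting the full course of an accident and the interactions among elements such as people, ships, cargo, environment, management, and information. Taking ship collisions as an example, this study therefore builds a knowledge graph for ship collision prevention and control in the water traffic accident domain from multi-source heterogeneous information. Taking full account of the accident components "event, spatiotemporal behavior, event cause, event consequence, responsible party, disposal decision", it proposes a standardized knowledge framework for ship collision accidents; builds a knowledge extraction model based on the Chinese whole-word-masking pre-trained language model (Chinese-bert-wwm); and, using the Neo4j database, constructs the ship collision prevention and control knowledge graph, which covers 15 entity types and 39 relation types and contains 35,784 entities and 325,097 relations. The proposed graph is significantly larger than existing knowledge graphs in the water transportation field, and the precision of automatic knowledge extraction reaches 85%, clearly higher than that of models such as the hidden Markov model (HMM) and the conditional random field (CRF). Specifically, the F1 scores of contextual inference for the "ship", "person characteristics", "time", "person", and "laws and regulations" entity types are 95%, 91%, 89%, 88%, and 88%, respectively, and the F1 score of relation recognition reaches 94%. These results indicate that extracting the semantic features of ship collision accidents with the Chinese-bert-wwm model strengthens the generalization ability of the knowledge extraction model. The study supports the knowledge representation of ship collision accidents and the retrospective review and reuse of accidents by maritime law enforcement officers, and it also helps improve the management effectiveness of the water transportation system.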
Keywords:
- water traffic safety
- ship collision accident
- knowledge graph
- Chinese-bert-wwm
Abstract: Traditional research on water transportation accidents mainly mines accident cases for causative factors and for the relationships among accidents, which is insufficient for reflecting how an accident evolves and how people, vessels, cargo, environment, administration, and information interact in the maritime system. To fill this gap, this paper proposes a methodology for developing a water transportation knowledge graph from multi-source heterogeneous information and applies it to ship collision prevention and control. First, a framework for ship collision knowledge is designed around the components of an accident: event, spatiotemporal ship behavior, causative factors, consequences, responsible parties, and disposal decision-making. Second, a knowledge extraction model based on the Chinese BERT (Bidirectional Encoder Representations from Transformers) model with whole word masking, Chinese-bert-wwm, is used to extract maritime safety knowledge. Third, the ship collision prevention and control knowledge graph (SCPCKG) is built on the Neo4j database; it contains 35,784 entities of 15 entity types and 325,097 relationships of 39 relationship types. The SCPCKG is significantly larger than existing knowledge graphs in the field of water transportation, and the precision of automated knowledge extraction reaches 85%, which is higher than that of existing models such as hidden Markov models (HMMs) and conditional random fields (CRFs). Specifically, the F1 scores for identifying "ship", "person characteristics", "time", "person", and "laws" entities are 95%, 91%, 89%, 88%, and 88%, respectively, and the F1 score of relationship extraction reaches 94%. The results show that the Chinese-bert-wwm model improves the generalization capability of the knowledge extraction model by extracting the semantic features of ship collision accidents from accident reports. The SCPCKG can be used to represent knowledge about ship collision accidents and to support the retrospective review of accidents by maritime administrators, improving the effectiveness of water transportation management.
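Table 1. Definitions of the various types of entities and attributes

| Type | Name | Definition |
| --- | --- | --- |
| Entity | Accident (事故) | A key link that connects all other entities. |
| Entity | Ship (船舶) | The backbone of the knowledge graph; ship entities are extracted directly from the information sources and linked to the other entity types. |
| Entity | Ship dynamics (船舶动态) | Ship behavior changes continuously as an accident develops; ship dynamics record how the accident evolves. |
| Entity | Person (人员) | Persons are a key part of the whole accident; their behavior, decisions, and skills directly affect how the accident develops, and they are linked to many entity types. |
| Entity | Organization (组织) | Includes maritime safety investigation bodies, ship survey bodies, shipyards, shipping management companies, and other organizations that play important roles in managing, investigating, handling, and preventing accidents. |
| Entity | Time (时间) | Includes event node times, ship construction and audit times, personnel service times, and organization founding times; it helps track and analyze the whole course of an accident. The main form is `**年**月**日****时` (year, month, day, time). |
| Entity | Location (位置) | Includes countries, administrative divisions at all levels, marine functional zones of various kinds, and man-made sites; it provides the geographic context and positioning basis for spatial analysis of accidents. |
| Entity | Environment (环境) | Environmental factors are important background information that affects how accidents occur and develop, including meteorology, water temperature, and the traffic-flow environment, for example wind force level `**`, current speed `**` knots, or a complex navigation environment. |
| Entity | Equipment (设备) | Includes ship equipment, engine-room equipment, rescue equipment, and the enforcement equipment of law enforcement agencies, such as shipborne radar and maritime infrared monitoring. The function, performance, and failures of equipment directly affect how accidents occur and are handled, and are important for analyzing causes and formulating preventive measures. |
| Entity | Cause (原因) | Includes the chain of causes that lead to an accident, arising from human error, equipment failure, environmental conditions, management negligence, and similar factors. |
| Entity | Result (结果) | Includes the casualties, economic losses, environmental impact, accident grade, and hull damage that the accident ultimately causes; it is essential for assessing severity and formulating improvement measures. |
| Entity | Laws and regulations (法律法规) | Includes all legal provisions that may have been violated in the event; it provides the legal basis for the subsequent handling of the accident. |
| Entity | Recommendation (建议) | Includes recommendations to persons and to organizations; the latter cover recommendations to companies and to regulatory bodies. |
| Attribute | Ship characteristics (船舶特征) | Every ship entity has multiple attributes, such as MMSI, IMO number, flag, and ship dimensions. |
| Attribute | Person characteristics (人员特征) | Person entities have multiple attributes, such as name, age, education, and gender, which provide background on personnel behavior and decisions for accident analysis. |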
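Table 2. Relation types

| Type | Relation name | English gloss |
| --- | --- | --- |
| Attribute subject-object relations (属性主客关系) | of_船舶特征 | of_ship characteristics |
| | of_人员特征 | of_person characteristics |
| Causal relations (因果关系) | of_违反 | of_violation |
| | of_后果 | of_consequence |
| | of_原因 | of_cause |
| | 产生_of_原因 | produces_of_cause |
| | of_建议 | of_recommendation |
| Affiliation relations between concepts (概念之间的隶属关系) | 发现 | discovers |
| | 任职 | holds a post |
| | 管理 | manages |
| | 持有 | holds |
| | 装备 | is equipped with |
| | 调度 | dispatches |
| | 发生事故 | has an accident |
| | 救助 | rescues |
| | 使用 | uses |
| | 会遇 | encounters |
| | 通知 | notifies |
| | 调查 | investigates |
| | 报告 | reports |
| | 隶属 | belongs to |
| | 操纵_of_实时动态 | maneuvers_of_real-time dynamics |
| | 操纵_of_主机状态 | maneuvers_of_main engine status |
| | 操纵_of_航行状态 | maneuvers_of_navigation status |
| Spatiotemporal relations (时空关系) | on_of_实时动态 | on_of_real-time dynamics |
| | on_of_航行状态 | on_of_navigation status |
| | on_of_主机状态 | on_of_main engine status |
| | at_of_实时动态 | at_of_real-time dynamics |
| | at_of_航行状态 | at_of_navigation status |
| | at_of_主机状态 | at_of_main engine status |
| | on_位置 | on_location |
| | go_位置 | go_location |
| | at_时间 | at_time |
| | leave_位置 | leave_location |
| | at_in_环境 | at_in_environment |
| | at_on_位置 | at_on_location |
| | in_环境 | in_environment |
| | to_时间 | to_time |
| | 时间_to_时间 | time_to_time |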
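To illustrate how typed triples built from the entity and relation types above might be loaded into Neo4j, here is a minimal Python sketch that validates a triple's relation name and renders a parameterized Cypher `MERGE` statement. The `Triple` class, the `ALLOWED_RELATIONS` subset, the `name` property, the pairing of `of_船舶特征` with ship and ship-characteristics nodes, and the sample values are all assumptions made for illustration; the source does not describe its loading code.

```python
from __future__ import annotations

from dataclasses import dataclass

# A small subset of the relation names listed in Table 2 (assumed here to be
# used verbatim as Neo4j relationship types).
ALLOWED_RELATIONS = {"of_船舶特征", "of_人员特征", "of_原因", "at_时间", "on_位置"}


@dataclass(frozen=True)
class Triple:
    """Hypothetical container for one extracted (head, relation, tail) fact."""
    head_type: str
    head: str
    relation: str
    tail_type: str
    tail: str


def quote_ident(name: str) -> str:
    """Backtick-quote a Cypher label or relationship type, doubling inner backticks."""
    return "`" + name.replace("`", "``") + "`"


def to_cypher(t: Triple) -> tuple[str, dict[str, str]]:
    """Return a parameterized MERGE query and its parameters for one triple."""
    if t.relation not in ALLOWED_RELATIONS:
        raise ValueError(f"unknown relation type: {t.relation}")
    query = (
        f"MERGE (h:{quote_ident(t.head_type)} {{name: $head}}) "
        f"MERGE (t:{quote_ident(t.tail_type)} {{name: $tail}}) "
        f"MERGE (h)-[:{quote_ident(t.relation)}]->(t)"
    )
    return query, {"head": t.head, "tail": t.tail}


# Hypothetical example: a ship node linked to one of its characteristics.
query, params = to_cypher(
    Triple("船舶", "Vessel A", "of_船舶特征", "船舶特征", "flag: example")
)
print(query)
print(params)
```

Using `MERGE` rather than `CREATE` keeps repeated mentions of the same entity from producing duplicate nodes, and passing values as parameters keeps free text from the reports out of the query string.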
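Table 3. Dataset statistics

| Dataset split | Sentences | Tokens | Entities |
| --- | --- | --- | --- |
| Training set | 7,650 | 634,503 | 42,298 |
| Validation set | 850 | 69,551 | 6,553 |
| Total | 8,500 | 704,054 | 48,851 |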
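The dataset figures can be checked with a few lines of arithmetic. The sketch below only re-derives quantities from the reported counts; the variable names are illustrative and not taken from the source.

```python
# Counts copied from Table 3: (sentences, tokens, entities) per split.
splits = {
    "train": (7650, 634503, 42298),
    "validation": (850, 69551, 6553),
}
reported_total = (8500, 704054, 48851)

# Column-wise sums over the two splits.
computed_total = tuple(sum(col) for col in zip(*splits.values()))

train_share = splits["train"][0] / reported_total[0]
tokens_per_sentence = reported_total[1] / reported_total[0]
entities_per_sentence = reported_total[2] / reported_total[0]

print(computed_total == reported_total)
print(f"train share: {train_share:.0%}")
print(f"tokens per sentence: {tokens_per_sentence:.1f}")
print(f"entities per sentence: {entities_per_sentence:.1f}")
```

The split sums match the reported totals, the training set holds 90% of the sentences, and a sentence averages about 83 tokens and about 5.7 labeled entities.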
Table 4. Model parameter settings

| Parameter | Setting |
| --- | --- |
| max_seq_len | 512 |
| train_batch_size | 32 |
| dev_batch_size | 8 |
| bert_learning_rate | 3×10⁻⁵ |
| crf_learning_rate | 3×10⁻³ |
| bert_hidden_size | 768 |
| lstm_hidden_size | 128 |
| Dropout rate | 0.01 |
| optimizer | Adam |
| save_step | 200 |
| epochs | 100 |
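One way to carry these settings in code is a small frozen dataclass. The sketch below is an assumption about how such a configuration could be organized, not the authors' training script; `TrainConfig`, `steps_per_epoch`, the one-example-per-sentence reading of the dataset, and the interpretation of `save_step` as a count of optimizer steps are all hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Hyperparameters as listed in Table 4."""
    max_seq_len: int = 512
    train_batch_size: int = 32
    dev_batch_size: int = 8
    bert_learning_rate: float = 3e-5
    crf_learning_rate: float = 3e-3
    bert_hidden_size: int = 768
    lstm_hidden_size: int = 128
    dropout_rate: float = 0.01
    optimizer: str = "Adam"
    save_step: int = 200
    epochs: int = 100


def steps_per_epoch(num_sentences: int, cfg: TrainConfig) -> int:
    """Optimizer steps per epoch, assuming one step per batch and a partial last batch."""
    return math.ceil(num_sentences / cfg.train_batch_size)


cfg = TrainConfig()
per_epoch = steps_per_epoch(7650, cfg)      # 7,650 training sentences (Table 3)
total_steps = per_epoch * cfg.epochs
checkpoints = total_steps // cfg.save_step  # assumes save_step counts optimizer steps
print(per_epoch, total_steps, checkpoints)
```

Note that the CRF learning rate in the table is 100 times the BERT learning rate, a common choice when a randomly initialized output layer is trained on top of a pre-trained encoder.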
Table 5. Experimental results of the compared models

| Model | Precision | Recall | F1 |
| --- | --- | --- | --- |
| HMM | 0.55 | 0.60 | 0.58 |
| CRF | 0.68 | 0.70 | 0.69 |
| Bi-lstm_CRF | 0.62 | 0.64 | 0.63 |
| Chinese-bert-wwm_Bi-lstm_CRF | 0.85 | 0.86 | 0.85 |
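The F1 column can be related to the other two through the harmonic mean, F1 = 2PR / (P + R). The sketch below recomputes it from the rounded precision and recall values; the source does not state how scores were averaged across entity types, so small differences from the reported column are to be expected.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) copied from Table 5.
reported = {
    "HMM": (0.55, 0.60, 0.58),
    "CRF": (0.68, 0.70, 0.69),
    "Bi-lstm_CRF": (0.62, 0.64, 0.63),
    "Chinese-bert-wwm_Bi-lstm_CRF": (0.85, 0.86, 0.85),
}

for model, (p, r, f1) in reported.items():
    recomputed = f1_score(p, r)
    print(f"{model}: recomputed {recomputed:.3f}, reported {f1:.2f}")
```

For every model the recomputed value is within 0.01 of the reported one, which is consistent with rounding of the inputs.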