
[1] WANG Mingda, ZHANG Bang, WU Zhisheng, et al. Information extraction method of urban gas accidents based on reinforcement learning[J]. Journal of Safety Science and Technology, 2023, 19(3): 39-45. [doi:10.11731/j.issn.1673-193x.2023.03.006]

Information extraction method of urban gas accidents based on reinforcement learning

Journal of Safety Science and Technology [ISSN:1673-193X/CN:11-5335/TB]

Volume:
19
Issue:
No. 3, 2023
Pages:
39-45
Section:
Academic Papers
Publication date:
2023-03-31

Article Info

Title:
Information extraction method of urban gas accidents based on reinforcement learning
Article ID:
1673-193X(2023)-03-0039-07
Author(s):
WANG Mingda (王明达), ZHANG Bang (张榜), WU Zhisheng (吴志生), LI Yunfei (李云飞)
(College of Mechanical and Electrical Engineering, China University of Petroleum (East China), Qingdao, Shandong 266580, China)
Keywords:
urban gas accident; named entity recognition; information extraction; reinforcement learning
Classification number (CLC):
X937
DOI:
10.11731/j.issn.1673-193x.2023.03.006
Document code:
A
Abstract:
To address the shortage of annotated samples in urban gas accident investigation reports, which degrades named entity recognition performance, a named entity recognition method for the gas accident domain based on bidirectional long short-term memory with conditional random fields (BiLSTM-CRF) and reinforcement learning is proposed. First, in the data preprocessing stage, a theme paragraph extraction method based on text structure is used to identify the key paragraphs of accident investigation reports. Second, in the model training stage, the BiLSTM-CRF model combined with reinforcement learning is used to train the named entity recognition model for urban gas accidents. Finally, urban gas accident investigation reports are used as test data for experimental validation. The results show that, after noise reduction by the reinforcement learning model, the comprehensive evaluation index of the entity recognition model improves by 5.76%, and that the theme paragraph recognition method improves the comprehensive evaluation index of the model by 7.17% compared with the Word2vec feature representation method.
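The abstract reports its gains as changes in a "comprehensive evaluation index". In named entity recognition work this term commonly denotes the F1 score, so the sketch below assumes that reading; the article page itself does not define the metric. It shows one way to compute entity-level precision, recall, and F1 from BIO-tagged sequences. The function names and the tag labels `LOC` and `EQP` are hypothetical and are not taken from the article.

```python
from typing import List, Set, Tuple

# (start index, exclusive end index, entity type)
Span = Tuple[int, int, str]


def bio_to_spans(tags: List[str]) -> Set[Span]:
    """Convert one BIO tag sequence into a set of entity spans."""
    spans: Set[Span] = set()
    start, etype = None, None
    # A trailing "O" sentinel closes an entity that ends at the last token.
    for i, tag in enumerate(tags + ["O"]):
        continues = tag.startswith("I-") and etype == tag[2:]
        if start is not None and not continues:
            spans.add((start, i, etype))
            start, etype = None, None
        if tag.startswith("B-"):
            start, etype = i, tag[2:]
    return spans


def precision_recall_f1(
    gold: List[List[str]], pred: List[List[str]]
) -> Tuple[float, float, float]:
    """Entity-level scores: a prediction counts only on an exact span and type match."""
    gold_spans, pred_spans = set(), set()
    for idx, (g, p) in enumerate(zip(gold, pred)):
        # Prefix each span with the sentence index so spans from
        # different sentences never collide.
        gold_spans |= {(idx, *s) for s in bio_to_spans(g)}
        pred_spans |= {(idx, *s) for s in bio_to_spans(p)}
    tp = len(gold_spans & pred_spans)
    precision = tp / len(pred_spans) if pred_spans else 0.0
    recall = tp / len(gold_spans) if gold_spans else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical example: two gold entities, one of them predicted.
gold = [["B-LOC", "I-LOC", "O", "B-EQP", "O"]]
pred = [["B-LOC", "I-LOC", "O", "O", "O"]]
p, r, f1 = precision_recall_f1(gold, pred)
print(f"precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
```

Under this assumption, a reported improvement of 5.76% would be a difference between two such F1 values measured on the same test set; whether the article means absolute or relative change is not stated on this page.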

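Memo

Received: 2022-09-11
About the author: WANG Mingda, PhD, lecturer. Research interests: big data for oil and gas safety, and informatization of safety engineering teaching and research.
Corresponding author: ZHANG Bang, master's student. Research interest: safety engineering informatization.
Last update: 2023-04-12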