
[1] XIAO Qiyao, JIA Baoshan, XU Yinuo, et al. Topics mining on potential safety hazard data of coal mine based on deep learning models[J]. Journal of Safety Science and Technology, 2024, 20(4): 49-55. [doi:10.11731/j.issn.1673-193x.2024.04.007]

Topics mining on potential safety hazard data of coal mine based on deep learning models

Journal of Safety Science and Technology [ISSN: 1673-193X / CN: 11-5335/TB]

Volume:
20
Issue:
No. 4, 2024
Pages:
49-55
Column:
Academic Papers
Publication date:
2024-04-30

Article Info

Title:
Topics mining on potential safety hazard data of coal mine based on deep learning models
Article ID:
1673-193X(2024)-04-0049-07
Author(s):
XIAO Qiyao, JIA Baoshan, XU Yinuo, ZHANG Maowei, LIANG Minghui
(1. School of Mining, Liaoning Technical University, Fuxin, Liaoning 123000, China;
2. Key Laboratory of Coal Mine Fire and Gas Prevention and Control, National Mine Safety Administration, Fushun, Liaoning 113000, China;
3. School of Safety Science and Engineering, Liaoning Technical University, Fuxin, Liaoning 123000, China)
Keywords:
potential safety hazard data of coal mine; bi-directional long short-term memory (BiLSTM); conditional random field (CRF); latent Dirichlet allocation (LDA); perplexity-topic variance
CLC number:
X936
DOI:
10.11731/j.issn.1673-193x.2024.04.007
Document code:
A
Abstract:
To improve the identification and supervision of coal mine safety risks, a model combining a bidirectional long short-term memory network (BiLSTM), a conditional random field (CRF), and latent Dirichlet allocation (LDA) is proposed. A BiLSTM-CRF model is trained for word segmentation, the perplexity-topic variance metric (perplexity-var) is used to determine the optimal number of topics for the LDA model, and the resulting BiLSTM-CRF-LDA model is applied to safety hazard data from a coal mine in Inner Mongolia. The results show that perplexity-var determines the number of topics more accurately; that the BiLSTM-CRF model segments words more accurately than the jieba library; and that the BiLSTM-CRF-LDA model accurately identifies the types of safety hazards, their spatial distribution, and the allocation of safety responsibilities. The results can provide a reference for coal mine safety risk inspection and supervision.
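The abstract names the perplexity-topic variance metric but does not give its formula. The sketch below illustrates one plausible reading, under stated assumptions: the score for a candidate topic number K is taken as the LDA perplexity divided by the variance among the K topic-word distributions, and the K with the lowest score is chosen. The function names (`topic_variance`, `perplexity_var`, `select_topic_number`), the definition of the variance, and the toy numbers are all hypothetical and are not taken from the paper. In practice the perplexity and topic-word matrices would come from fitted LDA models.

```python
from typing import Dict, List, Tuple

TopicWord = List[List[float]]  # one row per topic, one column per vocabulary word


def topic_variance(topic_word: TopicWord) -> float:
    """Mean squared Euclidean distance of each topic from the average topic.

    Assumption: this is used as the "topic variance"; a larger value means
    the topics are more clearly separated from one another.
    """
    k = len(topic_word)
    v = len(topic_word[0])
    mean = [sum(topic[j] for topic in topic_word) / k for j in range(v)]
    total = 0.0
    for topic in topic_word:
        total += sum((topic[j] - mean[j]) ** 2 for j in range(v))
    return total / k


def perplexity_var(perplexity: float, topic_word: TopicWord) -> float:
    """Assumed score: perplexity divided by topic variance (lower is better)."""
    var = topic_variance(topic_word)
    if var == 0.0:
        return float("inf")  # identical topics carry no information
    return perplexity / var


def select_topic_number(candidates: Dict[int, Tuple[float, TopicWord]]) -> int:
    """Return the candidate K whose (perplexity, topic-word matrix) scores lowest."""
    return min(candidates, key=lambda k: perplexity_var(*candidates[k]))


# Toy candidates (made-up numbers): K -> (perplexity, topic-word distributions).
toy_candidates = {
    2: (120.0, [[0.5, 0.5, 0.0, 0.0],
                [0.0, 0.0, 0.5, 0.5]]),
    3: (110.0, [[0.4, 0.4, 0.1, 0.1],
                [0.1, 0.1, 0.4, 0.4],
                [0.25, 0.25, 0.25, 0.25]]),
}

best_k = select_topic_number(toy_candidates)
print("selected number of topics:", best_k)
```

In this toy case the three-topic model has slightly lower perplexity, but its topics overlap more, so the combined score favors the candidate with better-separated topics. That trade-off is the motivation for pairing perplexity with a variance term rather than using perplexity alone.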



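Memo:
Received: 2023-09-14
About the author: XIAO Qiyao, master's student; main research interests: data mining and coal mine safety.
Corresponding author: JIA Baoshan, PhD, professor; main research interests: mine ventilation and safety, and mine disaster prevention and control.
Last Update: 2024-05-09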