A method for identifying relevant topics of pilot simulator training based on clustering of flight safety reports
Natural language processing (NLP) technologies can, among other applications, reveal patterns and trends in large sets of textual data. Textual safety data in the form of accident investigation reports is a promising source of new useful information that...
Main Authors: | Z. R. Zabbarov, A. K. Volkov |
---|---|
Format: | Article |
Language: | Russian |
Published: | Moscow State Technical University of Civil Aviation, 2024-08-01 |
Series: | Научный вестник МГТУ ГА |
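Subjects: | flight safety; simulator training; report; clustering; natural language processing; thematic modeling; doc2vec model |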
Online Access: | https://avia.mstuca.ru/jour/article/view/2400 |
_version_ | 1839579104427376640 |
---|---|
author | Z. R. Zabbarov; A. K. Volkov |
author_facet | Z. R. Zabbarov; A. K. Volkov |
author_sort | Z. R. Zabbarov |
collection | DOAJ |
description | Natural language processing (NLP) technologies can, among other applications, reveal patterns and trends in large sets of textual data. Textual safety data in the form of accident investigation reports is a promising source of new useful information for both flight safety management and simulator training. This paper discusses the application of NLP technologies to the corpus of flight safety reports of PJSC Aeroflot – Russian Airlines. The aim of the work is to develop a method for identifying relevant topics for pilot simulator training. The paper reviews existing foreign work on text mining in civil aviation and finds that NLP technologies are actively used abroad to study flight safety reports. It then presents the scheme of a method for identifying relevant topics of pilot simulator training based on clustering of flight safety reports, and describes the text preprocessing procedures and the construction of the vector space. The scientific novelty of the approach is that, unlike previous works, it uses a full vector representation of flight safety reports, built by combining matrices of thematic and semantic vectors. The proposed method was tested on a corpus of 1080 reports. The clustering algorithm identified 36 clusters, which were then visualized using t-distributed stochastic neighbor embedding (t-SNE). The practical significance of the results is that clustering-based analysis allows a more in-depth study of flight safety reports, which can simplify and speed up the work of both safety management specialists and flight simulator instructors. |
format | Article |
id | doaj-art-897ba5b28d1d443d9f51e1de0124f8c5 |
institution | Matheson Library |
issn | 2079-0619; 2542-0119 |
language | Russian |
publishDate | 2024-08-01 |
publisher | Moscow State Technical University of Civil Aviation |
record_format | Article |
series | Научный вестник МГТУ ГА |
spelling | doaj-art-897ba5b28d1d443d9f51e1de0124f8c5; 2025-08-04T10:35:18Z; rus; Moscow State Technical University of Civil Aviation; Научный вестник МГТУ ГА; 2079-0619; 2542-0119; 2024-08-01; 27; 4; 34; 49; 10.26467/2079-0619-2024-27-4-34-49; 1525; A method for identifying relevant topics of pilot simulator training based on clustering of flight safety reports; Z. R. Zabbarov [0]; A. K. Volkov [1]; [0] Public Joint Stock Company “Aeroflot – Russian Airlines”, Ulyanovsk Civil Aviation Institute named after Air Chief Marshal B.P. Bugaev; [1] Ulyanovsk Civil Aviation Institute named after Air Chief Marshal B.P. Bugaev; https://avia.mstuca.ru/jour/article/view/2400; flight safety; simulator training; report; clustering; natural language processing; thematic modeling; doc2vec model |
title | A method for identifying relevant topics of pilot simulator training based on clustering of flight safety reports |
topic | flight safety; simulator training; report; clustering; natural language processing; thematic modeling; doc2vec model |
url | https://avia.mstuca.ru/jour/article/view/2400 |
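The abstract describes building a full vector representation of each report by combining matrices of thematic and semantic vectors and then clustering the result. The record gives no implementation details, so the following is only a minimal sketch under stated assumptions: the combination is taken to be row-wise concatenation after L2 normalization, the clustering algorithm is taken to be k-means with a deterministic farthest-first initialization, and the tiny input matrices are made-up toy values standing in for real topic-model and doc2vec outputs. All function and variable names are hypothetical.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so neither block dominates the distance."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def combine(thematic, semantic):
    """Concatenate per-report thematic and semantic vectors row by row.

    Assumption: concatenation of normalized rows; the paper may combine
    the two matrices differently.
    """
    if len(thematic) != len(semantic):
        raise ValueError("both matrices need one row per report")
    return [l2_normalize(t) + l2_normalize(s) for t, s in zip(thematic, semantic)]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=50):
    """Plain k-means; an assumed stand-in for the paper's clustering algorithm."""
    # Farthest-first initialization: start from the first point, then
    # repeatedly add the point farthest from all chosen centroids.
    centroids = [list(points[0])]
    while len(centroids) < k:
        farthest = max(
            points,
            key=lambda p: min(squared_distance(p, c) for c in centroids),
        )
        centroids.append(list(farthest))

    labels = None
    for _ in range(iterations):
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Toy stand-ins (NOT data from the paper): 4 reports,
# 2 topic weights and 3 embedding dimensions each.
thematic_vectors = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]
semantic_vectors = [[1.0, 0.1, 0.0], [0.9, 0.2, 0.1], [0.0, 0.1, 1.0], [0.1, 0.0, 0.9]]

full_vectors = combine(thematic_vectors, semantic_vectors)
labels = kmeans(full_vectors, k=2)
print(labels)  # [0, 0, 1, 1]
```

In a real pipeline the two input matrices would come from a trained topic model and a doc2vec model over the preprocessed reports, and the clustered full vectors could then be projected to two dimensions with t-SNE for visualization, as the abstract mentions.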