A method for identifying relevant topics of pilot simulator training based on clustering of flight safety reports

Natural language processing (NLP) technologies make it possible, among other applications, to study patterns and trends in large collections of text. Textual safety data in the form of accident investigation reports is a promising source of new information that...

Bibliographic Details
Main Authors: Z. R. Zabbarov, A. K. Volkov
Format: Article
Language: Russian
Published: Moscow State Technical University of Civil Aviation 2024-08-01
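Series: Научный вестник МГТУ ГА
Subjects: flight safety; simulator training; report; clustering; natural language processing; thematic modeling; doc2vec model
Online Access: https://avia.mstuca.ru/jour/article/view/2400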
_version_ 1839579104427376640
author Z. R. Zabbarov
A. K. Volkov
collection DOAJ
description Natural language processing (NLP) technologies make it possible, among other applications, to study patterns and trends in large collections of text. Textual safety data in the form of accident investigation reports is a promising source of new information that can be used both in flight safety management and in simulator training. This paper applies NLP technologies to the corpus of flight safety reports of PJSC Aeroflot – Russian Airlines. The aim of the work is to develop a method for identifying relevant topics for pilot simulator training. The paper reviews existing foreign work on text mining in civil aviation and finds that NLP technologies are actively used abroad to study flight safety reports. It presents the scheme of a method for identifying relevant topics of pilot simulator training based on clustering of flight safety reports, and describes the text preprocessing procedures and the construction of the vector space. The scientific novelty of the approach is that, unlike previous work, it uses a full vector representation of flight safety reports, built by combining the matrices of topic (thematic) vectors and semantic vectors. The proposed method was tested on a corpus of 1080 reports. The clustering algorithm identified 36 clusters, which were then visualized with t-distributed stochastic neighbor embedding (t-SNE). The practical significance of the results is that clustering allows a more in-depth analysis of flight safety reports, which can simplify and speed up the work of both safety management specialists and flight simulator instructors.
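The combination step named in the description can be illustrated with a minimal sketch. Everything specific here is an assumption rather than the authors' procedure: the toy numbers stand in for real topic-model and doc2vec outputs, the per-block L2 normalization is one plausible way to balance the two matrices, and plain k-means with a deterministic start is swapped in because the record does not name the clustering algorithm. The function names `combine` and `kmeans` are hypothetical.

```python
import math


def l2_normalize(row):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in row))
    return [x / norm for x in row] if norm > 0 else list(row)


def combine(topic_vectors, semantic_vectors):
    """Build one 'full' vector per report by concatenating its normalized
    topic vector and its normalized semantic vector (assumed scheme)."""
    if len(topic_vectors) != len(semantic_vectors):
        raise ValueError("both matrices need one row per report")
    return [l2_normalize(t) + l2_normalize(s)
            for t, s in zip(topic_vectors, semantic_vectors)]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=20):
    """Plain k-means; the first k points serve as initial centroids so the
    result is reproducible. A stand-in, not the paper's algorithm."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid for every point.
        labels = [min(range(k), key=lambda c: squared_distance(p, centroids[c]))
                  for p in points]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Toy data: four reports, 2 topic weights and 3 semantic dimensions each.
topic_vectors = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]
semantic_vectors = [[1.0, 0.0, 0.2], [0.9, 0.1, 0.1],
                    [0.0, 1.0, 0.3], [0.1, 0.9, 0.2]]

full = combine(topic_vectors, semantic_vectors)
labels = kmeans(full, k=2)
print(labels)
```

In practice the cluster labels would then be inspected per cluster and the combined vectors projected to two dimensions (for example with t-SNE, as the description mentions) for visualization.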
id doaj-art-897ba5b28d1d443d9f51e1de0124f8c5
institution Matheson Library
issn 2079-0619
2542-0119
spelling doaj-art-897ba5b28d1d443d9f51e1de0124f8c5
2025-08-04T10:35:18Z
rus
Moscow State Technical University of Civil Aviation
Научный вестник МГТУ ГА
2079-0619
2542-0119
2024-08-01
27
4
34
49
10.26467/2079-0619-2024-27-4-34-49
1525
A method for identifying relevant topics of pilot simulator training based on clustering of flight safety reports
Z. R. Zabbarov (Public Joint Stock Company “Aeroflot – Russian Airlines”; Ulyanovsk Civil Aviation Institute named after Air Chief Marshal B.P. Bugaev)
A. K. Volkov (Ulyanovsk Civil Aviation Institute named after Air Chief Marshal B.P. Bugaev)
https://avia.mstuca.ru/jour/article/view/2400
flight safety
simulator training
report
clustering
natural language processing
thematic modeling
doc2vec model
topic flight safety
simulator training
report
clustering
natural language processing
thematic modeling
doc2vec model