Comparative analysis of audio-MAE and MAE-AST models for real-time audio classification

Bibliographic Details
Main Author: Lesia Mochurad
Format: Article
Language: English
Published: Taylor & Francis Group 2025-07-01
Series: Automatika
Online Access: https://www.tandfonline.com/doi/10.1080/00051144.2025.2504749
Description
Summary: Real-time audio classification requires systems that are both highly accurate and low in signal-processing latency. The main challenges include processing large volumes of data, particularly high-quality audio files, which demand significant computing resources. Another important problem is noise and other interference, which systems must filter effectively without losing useful information. In addition, the diversity of audio signals, such as speech recordings with different accents and tones, requires classification models to be flexible and adaptable. Real-time processing involves optimizing performance to minimize latency, which is critical for responding quickly to incoming data. The ability of systems to adapt to new conditions and signals ensures their effectiveness in dynamic environments. This article presents a comparative analysis of the Audio-MAE and MAE-AST models in terms of performance, efficiency, and parallelization capabilities. It discusses solutions to these challenges that aim to balance processing speed against classification accuracy and to optimize the use of hardware resources.
ISSN: 0005-1144, 1848-3380