Comparative analysis of audio-MAE and MAE-AST models for real-time audio classification
Main Author: | |
---|---|
Format: | Article |
Language: | English |
Published: | Taylor & Francis Group, 2025-07-01 |
Series: | Automatika |
Subjects: | |
Online Access: | https://www.tandfonline.com/doi/10.1080/00051144.2025.2504749 |
Summary: | Real-time audio classification is a complex process that requires systems to be both highly accurate and low-latency in signal processing. The main challenges include processing large amounts of data, particularly for high-quality audio files, which require significant computing resources. Another important problem is noise and other interference, which systems must filter effectively without losing useful information. In addition, the diversity of audio signals, such as speech recordings with different accents and tones, requires flexible and adaptable classification models. Implementing real-time processing involves optimizing performance to minimize latency, which is critical for responding quickly to incoming data. The ability of systems to adapt to new conditions and signals ensures their effectiveness in dynamic environments. This article presents a comparative analysis of the Audio-MAE and MAE-AST models, covering their performance, efficiency, and parallelization capabilities. The paper discusses innovative solutions to the existing challenges that aim to balance processing speed against classification accuracy and to optimize the use of hardware resources. |
---|---|
ISSN: | 0005-1144 1848-3380 |