An interpretable multi-transformer ensemble for text-based movie genre classification
Multi-label movie genre classification is challenging due to the inherent ambiguity and overlap between different genres. Most of the existing works in genre classification use audio-visual modalities. The potential of text-based modalities in movie genre classification is still underexplored. This...
Saved in:
Main Authors: | , , , , , |
---|---|
Format: | Article |
Language: | English |
Published: |
PeerJ Inc.
2025-06-01
|
Series: | PeerJ Computer Science |
Subjects: | |
Online Access: | https://peerj.com/articles/cs-2945.pdf |
Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
Summary: | Multi-label movie genre classification is challenging due to the inherent ambiguity and overlap between different genres. Most of the existing works in genre classification use audio-visual modalities. The potential of text-based modalities in movie genre classification is still underexplored. This paper proposes an ensemble deep-learning model that uses movie plots to predict movie genres. After pre-processing the text plots, three transformer-based models, Bidirectional Encoder Representations from Transformers (BERT), DistilBERT, and Robustly Optimized BERT Pre-training Approach (ROBERTa), are used to generate genre predictions, combined through a weighted soft-voting method. The proposed ensemble architecture achieves state-of-the-art performance on two benchmark datasets, Trailers12K and LMTD9, with a micro-average precision of 80.10% and 80.37%, respectively, significantly outperforming both traditional machine learning approaches and advanced deep learning models. The ensemble’s superior performance is attributed to its ability to combine the diverse strengths of individual models and capture nuanced genre-specific information from textual features. The lack of interpretability in deep learning models for genre classification is addressed using Local Interpretable Model-Agnostic Explanations (LIME), which provides both local and global explanations for the model’s predictions. The findings of the study highlight the potential of textual data in automated genre classification and emphasize the importance of interpretability methods in multi-label genre classification. |
---|---|
ISSN: | 2376-5992 |