An interpretable multi-transformer ensemble for text-based movie genre classification

Bibliographic Details
Main Authors:	Faheem Shaukat, Naveed Ejaz, Zeeshan Ashraf, Mrim M. Alnfiai, Nouf Nawar Alotaibi, Salma Mohsen M. Alnefaie
Format:	Article
Language:	English
Published:	PeerJ Inc. 2025-06-01
Series:	PeerJ Computer Science
Subjects:	Movie genre; Transformer; Textual data
Online Access:	https://peerj.com/articles/cs-2945.pdf

Description
Summary:	Multi-label movie genre classification is challenging because genres are inherently ambiguous and overlap with one another. Most existing work on genre classification uses audio-visual modalities, and the potential of text-based modalities remains underexplored. This paper proposes an ensemble deep-learning model that uses movie plots to predict movie genres. After the plot text is pre-processed, three transformer-based models, Bidirectional Encoder Representations from Transformers (BERT), DistilBERT, and Robustly Optimized BERT Pre-training Approach (RoBERTa), generate genre predictions, which are combined through a weighted soft-voting method. The proposed ensemble architecture achieves state-of-the-art performance on two benchmark datasets, Trailers12K and LMTD9, with a micro-average precision of 80.10% and 80.37%, respectively, significantly outperforming both traditional machine learning approaches and advanced deep learning models. The ensemble’s superior performance is attributed to its ability to combine the diverse strengths of the individual models and to capture nuanced genre-specific information from textual features. The lack of interpretability in deep learning models for genre classification is addressed using Local Interpretable Model-Agnostic Explanations (LIME), which provides both local and global explanations for the model’s predictions. The findings highlight the potential of textual data in automated genre classification and emphasize the importance of interpretability methods in multi-label genre classification.
ISSN:	2376-5992
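
The weighted soft-voting step mentioned in the summary can be illustrated with a minimal sketch. It assumes each model emits one probability per genre and that the ensemble takes a weighted average followed by a fixed threshold. The weights, probabilities, genre names, threshold, and the function name `weighted_soft_vote` are hypothetical placeholders, not values or code from the paper.

```python
from typing import List, Sequence


def weighted_soft_vote(
    model_probs: Sequence[Sequence[float]],
    weights: Sequence[float],
) -> List[float]:
    """Return the weighted average of per-genre probabilities across models."""
    if len(model_probs) != len(weights):
        raise ValueError("one weight is required per model")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    n_labels = len(model_probs[0])
    if any(len(p) != n_labels for p in model_probs):
        raise ValueError("all models must score the same set of genres")
    # For each genre, sum weight * probability over models, then normalise.
    return [
        sum(w * p[j] for w, p in zip(weights, model_probs)) / total
        for j in range(n_labels)
    ]


# Hypothetical genre set and per-model probabilities for a single plot.
genres = ["action", "comedy", "drama"]
bert_probs = [0.9, 0.2, 0.6]
distilbert_probs = [0.8, 0.4, 0.4]
roberta_probs = [0.7, 0.1, 0.7]

# Hypothetical weights; in practice they would be tuned on validation data.
weights = [0.3, 0.2, 0.5]

ensemble_probs = weighted_soft_vote(
    [bert_probs, distilbert_probs, roberta_probs], weights
)

# Multi-label decision: keep every genre whose averaged score passes
# an assumed threshold of 0.5.
threshold = 0.5
predicted = [g for g, p in zip(genres, ensemble_probs) if p >= threshold]
print(predicted)  # ['action', 'drama']
```

Because the task is multi-label, each genre is thresholded independently rather than selecting the single highest-scoring class.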

Similar Items