Sentiment Analysis on the PT Pertamina Corruption Case using IndoBERT and RCNN Methods

Bibliographic Details
Main Authors: Wildan Jaya Kusoema, Ichsan Ibrahim
Format: Article
Language: Indonesian
Published: Islamic University of Indragiri 2025-09-01
Series: Sistemasi: Jurnal Sistem Informasi
Online Access: https://sistemasi.ftik.unisi.ac.id/index.php/stmsi/article/view/5392
Description
Summary: This study evaluates the performance of a hybrid IndoBERT-RCNN model in classifying public sentiment toward the PT Pertamina corruption case, with a focus on how different hyperparameter combinations affect model accuracy. The dataset consists of 10,078 YouTube comments collected via the YouTube Data API. The comments were preprocessed, automatically labeled using an Indonesian-language RoBERTa model, and balanced across classes using undersampling and contextual embedding-based augmentation with IndoBERT. The model architecture uses IndoBERT as a feature extractor and RCNN as the classifier, and was tested with various combinations of learning rates and batch sizes. Experimental results show that the best configuration used a learning rate of 2e-5 and a batch size of 16, yielding an accuracy of 84% and an F1-score of 83%. While the model performed strongly on negative comments, accuracy for the neutral and positive classes was relatively lower due to semantic overlap and ambiguity in user expressions. This study contributes to Indonesian-language sentiment analysis by (1) integrating the IndoBERT-RCNN architecture for sociopolitical issues, (2) systematically evaluating hyperparameter combinations for three-class public opinion data, and (3) utilizing YouTube comments as a relevant source of informal public discourse. The findings have potential applications in real-time digital public opinion monitoring systems for strategic national issues.
ISSN: 2302-8149, 2540-9719
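
The summary describes comparing learning-rate and batch-size combinations on three-class data and reporting accuracy and F1-score. The following is a minimal Python sketch of that kind of comparison, not the authors' code. Several things here are assumptions: the label names, the use of macro-averaged F1 (the record does not say which averaging was used), selection of the best configuration by F1, and the hypothetical `evaluate` callable, which stands in for training and testing the IndoBERT-RCNN model and returns reference and predicted labels.

```python
from itertools import product
from typing import Callable, Dict, Sequence, Tuple

# Assumed label names for the three sentiment classes.
LABELS = ("negative", "neutral", "positive")

Config = Tuple[float, int]                          # (learning rate, batch size)
Predictions = Tuple[Sequence[str], Sequence[str]]   # (reference labels, predicted labels)


def accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Fraction of comments whose predicted label matches the reference label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_f1(y_true: Sequence[str], y_pred: Sequence[str],
             labels: Sequence[str] = LABELS) -> float:
    """Unweighted mean of per-class F1 scores (macro averaging is an assumption)."""
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # A class with no true or predicted examples contributes an F1 of 0.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def grid_search(evaluate: Callable[[float, int], Predictions],
                learning_rates: Sequence[float],
                batch_sizes: Sequence[int]) -> Tuple[Config, Dict[Config, Tuple[float, float]]]:
    """Score every (learning rate, batch size) pair and return the best by macro F1.

    `evaluate` is a hypothetical placeholder for training and testing a model
    with the given hyperparameters.
    """
    results: Dict[Config, Tuple[float, float]] = {}
    for lr, bs in product(learning_rates, batch_sizes):
        y_true, y_pred = evaluate(lr, bs)
        results[(lr, bs)] = (accuracy(y_true, y_pred), macro_f1(y_true, y_pred))
    best = max(results, key=lambda cfg: results[cfg][1])
    return best, results
```

In use, `evaluate` would wrap a real training run; the grid values passed to `grid_search` are the caller's choice, since the record names only the best pair (2e-5 and 16) and not the full set of values tested.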