Sentiment Analysis on the PT Pertamina Corruption Case using IndoBERT and RCNN Methods
Main Authors: | , |
---|---|
Format: | Article |
Language: | Indonesian |
Published: | Islamic University of Indragiri, 2025-09-01 |
Series: | Sistemasi: Jurnal Sistem Informasi |
Subjects: | |
Online Access: | https://sistemasi.ftik.unisi.ac.id/index.php/stmsi/article/view/5392 |
Summary: | This study aims to evaluate the performance of a hybrid IndoBERT-RCNN model in classifying public sentiment toward the PT Pertamina corruption case, with a focus on how different hyperparameter combinations affect model accuracy. The dataset consists of 10,078 YouTube comments collected via the YouTube Data API, which were then preprocessed, automatically labeled using an Indonesian-language RoBERTa model, and balanced through class distribution techniques including undersampling and contextual embedding-based augmentation with IndoBERT. The model architecture integrates IndoBERT as a feature extractor and RCNN as the classifier, and was tested using various combinations of learning rates and batch sizes. Experimental results show that the optimal configuration was achieved with a learning rate of 2e-5 and a batch size of 16, resulting in an accuracy of 84% and an F1-score of 83%.
The model classified negative comments well, but accuracy for the neutral and positive classes was lower because of semantic overlap and ambiguity in user expressions.
This study contributes to Indonesian-language sentiment analysis by (1) integrating the IndoBERT-RCNN architecture for sociopolitical issues, (2) systematically evaluating hyperparameter combinations for three-class public opinion data, and (3) using YouTube comments as a relevant source of informal public discourse. The findings have potential applications in real-time digital public opinion monitoring systems for strategic national issues. |
---|---|
ISSN: | 2302-8149 2540-9719 |