Automatic Classification of Multilanguage Scientific Papers to the Sustainable Development Goals Using Transfer Learning

Bibliographic Details
Main Authors: Lya Hulliyyatus Suadaa, Anugerah Karta Monika, Berliana Sugiarti Putri, Yeni Rimawati
Format: Article
Language: English
Published: Ikatan Ahli Informatika Indonesia 2025-06-01
Series: Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi)
Online Access: https://jurnal.iaii.or.id/index.php/RESTI/article/view/6560
Description
Summary: Classifying scientific papers by their relevance to the Sustainable Development Goals (SDGs) is a critical step in assessing the state of research on each goal. However, as the volume of scientific literature published worldwide in multiple languages grows, manual categorization of these papers has become increasingly complex and time-consuming. The task is further complicated by the need for a comprehensive multilingual dataset to train effective models, since obtaining such datasets for many languages is resource-intensive. This study addresses the problem by using transfer learning to automatically assign SDG labels to scientific papers. By fine-tuning the pretrained multilingual model mBERT on SDG publication datasets in a multilabel setting, we demonstrate that transfer learning can significantly improve classification performance compared to an SVM, even with limited labelled data. Our approach processes scientific papers in different languages and maps research at three levels: relevance to the SDGs, the four pillars of the SDGs, and the 17 individual goals. The proposed method addresses the scalability issue in SDG classification and lays the groundwork for more efficient systems that can handle the multilingual nature of modern scientific publications.
ISSN: 2580-0760
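
Illustrative note: the summary describes a multilabel setting, in which a paper may be assigned several SDG labels at once. The sketch below shows only that decision rule, not the authors' fine-tuning pipeline. It assumes a model that emits one raw score (logit) per label; the label names, the 0.5 threshold, and the binary cross-entropy loss are assumptions chosen for illustration and are not stated in the record.

```python
import math

# Hypothetical subset of labels for illustration; the study covers 17 goals.
LABELS = ["SDG1", "SDG2", "SDG3"]


def sigmoid(x: float) -> float:
    """Map a raw score (logit) to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def predict_labels(logits, threshold=0.5):
    """Return every label whose independent probability meets the threshold.

    Each label is scored separately, so zero, one, or several labels
    can be returned for the same paper.
    """
    return [label for label, z in zip(LABELS, logits) if sigmoid(z) >= threshold]


def bce_loss(logits, targets):
    """Mean binary cross-entropy over labels (an assumed, common multilabel loss)."""
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for z, t in zip(logits, targets):
        p = sigmoid(z)
        total += -(t * math.log(p + eps) + (1 - t) * math.log(1 - p + eps))
    return total / len(logits)


if __name__ == "__main__":
    # Hypothetical logits for one paper, one score per label.
    print(predict_labels([2.0, -1.0, 0.3]))
    print(bce_loss([2.0, -1.0, 0.3], [1, 0, 1]))
```

Scoring each label with its own sigmoid, rather than a single softmax over all labels, is what allows one paper to map to more than one goal.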