Cross-Lingual Summarization for Low-Resource Languages Using Multilingual Retrieval-Based In-Context Learning

Cross-lingual summarization (XLS) involves generating a summary in one language from an article written in another language. XLS presents substantial hurdles due to the complex linguistic structures across languages and the challenges in transferring knowledge effectively between them. Although Large Language Models (LLMs) have demonstrated capabilities in cross-lingual tasks, the integration of retrieval-based in-context learning remains largely unexplored, despite its potential to overcome these linguistic barriers by providing relevant examples. In this paper, we introduce Multilingual Retrieval-based Cross-lingual Summarization (MuRXLS), a robust framework that dynamically selects the most relevant summarization examples for each article using multilingual retrieval. Our method leverages multilingual embedding models to identify contextually appropriate demonstrations for various LLMs. Experiments across twelve XLS setups (six language pairs in both directions) reveal a notable directional asymmetry: our approach significantly outperforms baselines in many-to-one (X→English) scenarios, while showing comparable performance in one-to-many (English→X) directions. We also observe a strong correlation between article-example semantic similarity and summarization quality, demonstrating that intelligently selecting contextually relevant examples substantially improves XLS performance by providing LLMs with more informative demonstrations.
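
To make the retrieval step concrete, the following is a minimal illustrative sketch (not the authors' released code) of how multilingual retrieval-based example selection and few-shot prompt construction could look. The sentence-transformers checkpoint, the demonstration pool, and all function names below are assumptions made for illustration, not details taken from the paper.

import numpy as np
from sentence_transformers import SentenceTransformer

# Assumed multilingual encoder; the paper does not prescribe this checkpoint.
encoder = SentenceTransformer("sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2")

def select_demonstrations(article, pool, k=3):
    # pool: list of (source_article, target_summary) pairs from the training data.
    # Embed the query article and all candidate articles, then rank candidates
    # by cosine similarity (embeddings are L2-normalized, so a dot product suffices).
    texts = [article] + [src for src, _ in pool]
    emb = encoder.encode(texts, convert_to_numpy=True, normalize_embeddings=True)
    sims = emb[1:] @ emb[0]
    top = np.argsort(-sims)[:k]
    return [pool[i] for i in top]

def build_prompt(article, demos, src_lang="French", tgt_lang="English"):
    # Assemble a few-shot prompt for an instruction-following LLM,
    # one "Article / Summary" demonstration per retrieved pair.
    parts = [f"Summarize the following {src_lang} article in {tgt_lang}."]
    for src, tgt in demos:
        parts.append(f"Article: {src}\nSummary: {tgt}")
    parts.append(f"Article: {article}\nSummary:")
    return "\n\n".join(parts)

The cosine similarity computed here between the input article and its retrieved demonstrations is the article-example semantic similarity that the abstract links to summarization quality.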

Bibliographic Details
Main Authors: Gyutae Park, Jeonghyun Park, Hwanhee Lee
Format: Article
Language:English
Published: MDPI AG 2025-07-01
Series:Applied Sciences
Subjects: cross-lingual summarization; multilingual retrieval; in-context learning; low-resource languages; large language models
Online Access:https://www.mdpi.com/2076-3417/15/14/7800
author Gyutae Park
Jeonghyun Park
Hwanhee Lee
collection DOAJ
description Cross-lingual summarization (XLS) involves generating a summary in one language from an article written in another language. XLS presents substantial hurdles due to the complex linguistic structures across languages and the challenges in transferring knowledge effectively between them. Although Large Language Models (LLMs) have demonstrated capabilities in cross-lingual tasks, the integration of retrieval-based in-context learning remains largely unexplored, despite its potential to overcome these linguistic barriers by providing relevant examples. In this paper, we introduce Multilingual Retrieval-based Cross-lingual Summarization (MuRXLS), a robust framework that dynamically selects the most relevant summarization examples for each article using multilingual retrieval. Our method leverages multilingual embedding models to identify contextually appropriate demonstrations for various LLMs. Experiments across twelve XLS setups (six language pairs in both directions) reveal a notable directional asymmetry: our approach significantly outperforms baselines in many-to-one (X→English) scenarios, while showing comparable performance in one-to-many (English→X) directions. We also observe a strong correlation between article-example semantic similarity and summarization quality, demonstrating that intelligently selecting contextually relevant examples substantially improves XLS performance by providing LLMs with more informative demonstrations.
format Article
id doaj-art-f4ac2aae90dd42c39e7af7cc8b2d1763
institution Matheson Library
issn 2076-3417
language English
publishDate 2025-07-01
publisher MDPI AG
record_format Article
series Applied Sciences
doi 10.3390/app15147800
affiliation Department of Artificial Intelligence, Chung-Ang University, Seoul 06974, Republic of Korea (all authors)
citation Applied Sciences, vol. 15, no. 14, article 7800, 2025-07-01
title Cross-Lingual Summarization for Low-Resource Languages Using Multilingual Retrieval-Based In-Context Learning
topic cross-lingual summarization
multilingual retrieval
in-context learning
low-resource languages
large language models
url https://www.mdpi.com/2076-3417/15/14/7800