Cross-Lingual Summarization for Low-Resource Languages Using Multilingual Retrieval-Based In-Context Learning
Cross-lingual summarization (XLS) involves generating a summary in one language from an article written in another language. XLS presents substantial hurdles due to the complex linguistic structures across languages and the challenges in transferring knowledge effectively between them. Although Larg...
Main Authors: | Gyutae Park, Jeonghyun Park, Hwanhee Lee |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2025-07-01 |
Series: | Applied Sciences |
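Subjects: | cross-lingual summarization; multilingual retrieval; in-context learning; low-resource languages; large language models |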
Online Access: | https://www.mdpi.com/2076-3417/15/14/7800 |
_version_ | 1839616605482385408 |
---|---|
author | Gyutae Park; Jeonghyun Park; Hwanhee Lee |
author_facet | Gyutae Park; Jeonghyun Park; Hwanhee Lee |
author_sort | Gyutae Park |
collection | DOAJ |
description | Cross-lingual summarization (XLS) involves generating a summary in one language from an article written in another language. XLS presents substantial hurdles due to the complex linguistic structures across languages and the challenges in transferring knowledge effectively between them. Although Large Language Models (LLMs) have demonstrated capabilities in cross-lingual tasks, the integration of retrieval-based in-context learning remains largely unexplored, despite its potential to overcome these linguistic barriers by providing relevant examples. In this paper, we introduce Multilingual Retrieval-based Cross-lingual Summarization (MuRXLS), a robust framework that dynamically selects the most relevant summarization examples for each article using multilingual retrieval. Our method leverages multilingual embedding models to identify contextually appropriate demonstrations for various LLMs. Experiments across twelve XLS setups (six language pairs in both directions) reveal a notable directional asymmetry: our approach significantly outperforms baselines in many-to-one (X→English) scenarios, while showing comparable performance in one-to-many (English→X) directions. We also observe a strong correlation between article-example semantic similarity and summarization quality, demonstrating that intelligently selecting contextually relevant examples substantially improves XLS performance by providing LLMs with more informative demonstrations. |
format | Article |
id | doaj-art-f4ac2aae90dd42c39e7af7cc8b2d1763 |
institution | Matheson Library |
issn | 2076-3417 |
language | English |
publishDate | 2025-07-01 |
publisher | MDPI AG |
record_format | Article |
series | Applied Sciences |
spelling | doaj-art-f4ac2aae90dd42c39e7af7cc8b2d1763; 2025-07-25T13:12:24Z; eng; MDPI AG; Applied Sciences; ISSN 2076-3417; 2025-07-01; vol. 15, no. 14, art. 7800; DOI 10.3390/app15147800; Cross-Lingual Summarization for Low-Resource Languages Using Multilingual Retrieval-Based In-Context Learning; Gyutae Park, Jeonghyun Park, Hwanhee Lee (all: Department of Artificial Intelligence, Chung-Ang University, Seoul 06974, Republic of Korea); https://www.mdpi.com/2076-3417/15/14/7800; cross-lingual summarization; multilingual retrieval; in-context learning; low-resource languages; large language models |
spellingShingle | Gyutae Park Jeonghyun Park Hwanhee Lee Cross-Lingual Summarization for Low-Resource Languages Using Multilingual Retrieval-Based In-Context Learning Applied Sciences cross-lingual summarization multilingual retrieval in-context learning low-resource languages large language models |
title | Cross-Lingual Summarization for Low-Resource Languages Using Multilingual Retrieval-Based In-Context Learning |
title_full | Cross-Lingual Summarization for Low-Resource Languages Using Multilingual Retrieval-Based In-Context Learning |
title_fullStr | Cross-Lingual Summarization for Low-Resource Languages Using Multilingual Retrieval-Based In-Context Learning |
title_full_unstemmed | Cross-Lingual Summarization for Low-Resource Languages Using Multilingual Retrieval-Based In-Context Learning |
title_short | Cross-Lingual Summarization for Low-Resource Languages Using Multilingual Retrieval-Based In-Context Learning |
title_sort | cross lingual summarization for low resource languages using multilingual retrieval based in context learning |
topic | cross-lingual summarization; multilingual retrieval; in-context learning; low-resource languages; large language models |
url | https://www.mdpi.com/2076-3417/15/14/7800 |
work_keys_str_mv | AT gyutaepark crosslingualsummarizationforlowresourcelanguagesusingmultilingualretrievalbasedincontextlearning AT jeonghyunpark crosslingualsummarizationforlowresourcelanguagesusingmultilingualretrievalbasedincontextlearning AT hwanheelee crosslingualsummarizationforlowresourcelanguagesusingmultilingualretrievalbasedincontextlearning |
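
The description field above says that MuRXLS selects the most relevant summarization examples for each article with multilingual retrieval and supplies them to an LLM as demonstrations. The record gives no implementation details, so the following is only a minimal sketch of that general idea under stated assumptions. The names `Example`, `embed`, `cosine`, `retrieve`, and `build_prompt` are hypothetical, the prompt wording is invented for illustration, and `embed` is a toy character n-gram counter standing in for a real multilingual embedding model.

```python
import math
from collections import Counter
from dataclasses import dataclass


@dataclass
class Example:
    article: str  # article in the source language
    summary: str  # reference summary in the target language


def embed(text: str, n: int = 3) -> Counter:
    """Toy stand-in (assumption) for a multilingual embedding model:
    a sparse vector of character n-gram counts."""
    padded = f" {text.lower()} "
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(article: str, pool: list[Example], k: int = 2) -> list[Example]:
    """Return the k pool examples whose articles are most similar to the input."""
    query = embed(article)
    ranked = sorted(pool, key=lambda ex: cosine(query, embed(ex.article)), reverse=True)
    return ranked[:k]


def build_prompt(article: str, demos: list[Example], target_lang: str) -> str:
    """Assemble a few-shot prompt: instruction, retrieved demonstrations, then the query."""
    parts = [f"Summarize each article in {target_lang}."]
    for ex in demos:
        parts.append(f"Article: {ex.article}\nSummary: {ex.summary}")
    parts.append(f"Article: {article}\nSummary:")
    return "\n\n".join(parts)


# Hypothetical Spanish-to-English example pool and query article.
pool = [
    Example("El banco central subió los tipos de interés para frenar la inflación.",
            "The central bank raised interest rates to curb inflation."),
    Example("El equipo ganó la final de fútbol en la prórroga.",
            "The team won the football final in extra time."),
    Example("Las lluvias intensas provocaron inundaciones en la costa.",
            "Heavy rain caused flooding on the coast."),
]
query = "El banco central mantuvo los tipos de interés pese a la inflación."
demos = retrieve(query, pool, k=1)
prompt = build_prompt(query, demos, target_lang="English")
```

In a realistic setup, `embed` would presumably be replaced by a dense multilingual sentence encoder so that semantically related articles match even without shared surface strings, and `prompt` would be sent to the chosen LLM. Neither step is shown here because the record does not specify which models the authors used.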