Leveraging text semantics for enhanced scene text image super-resolution
| Main Authors: | , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Tsinghua University Press, 2025-06-01 |
| Series: | Intelligent and Converged Networks |
| Online Access: | https://www.sciopen.com/article/10.23919/ICN.2025.0009 |
| Summary: | In recent years, advances in neural networks have driven unprecedented progress in super-resolution technology. However, most existing super-resolution methods treat scene text images as ordinary images, ignoring the text information they contain. This paper proposes incorporating text-specific categorical priors into the training of a scene text image super-resolution model, in a framework called two-stage text prior super-resolution (TTPSR). TTPSR first employs a parallel context attention network to restore the low-resolution image without text priors. The restored image is then passed to a text recognizer to obtain the text prior, and an attention mechanism fuses the text prior with the image features to guide generation of the final high-resolution text image. Experiments show that TTPSR outperforms relevant state-of-the-art models in peak signal-to-noise ratio, structural similarity index, and text recognition accuracy on the TextZoom dataset. |
| ISSN: | 2708-6240 |
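The fusion step described in the summary — image features attending to a recognizer-derived text prior — can be sketched roughly as cross-attention. This is a minimal NumPy illustration, not the authors' implementation: all shapes, projection weights, and the residual-fusion choice are assumptions for demonstration.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def fuse_text_prior(image_feats, text_prior, d_k=32, seed=0):
    """Cross-attention fusion sketch: each image-feature position attends
    to the per-character text-prior embeddings. Projection matrices are
    random placeholders standing in for trained parameters."""
    rng = np.random.default_rng(seed)
    d_img = image_feats.shape[-1]
    d_txt = text_prior.shape[-1]
    Wq = rng.standard_normal((d_img, d_k)) / np.sqrt(d_img)
    Wk = rng.standard_normal((d_txt, d_k)) / np.sqrt(d_txt)
    Wv = rng.standard_normal((d_txt, d_img)) / np.sqrt(d_txt)
    Q = image_feats @ Wq                      # (n_positions, d_k)
    K = text_prior @ Wk                       # (n_chars, d_k)
    V = text_prior @ Wv                       # (n_chars, d_img)
    attn = softmax(Q @ K.T / np.sqrt(d_k))    # (n_positions, n_chars)
    return image_feats + attn @ V             # residual fusion

# Stage 1 output: flattened feature map of the coarsely restored image
image_feats = np.random.default_rng(1).standard_normal((64, 48))
# Text prior: per-character embeddings from a recognizer (hypothetical shapes)
text_prior = np.random.default_rng(2).standard_normal((8, 26))
fused = fuse_text_prior(image_feats, text_prior)
print(fused.shape)  # (64, 48): fused features keep the image-feature shape
```

The fused features have the same shape as the input image features, so a subsequent decoder can consume them unchanged, which is the usual motivation for a residual cross-attention design.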