Appearance consistency and motion coherence learning for internal video inpainting

Bibliographic Details
Main Authors: Ruixin Liu, Yuesheng Zhu, GuiBo Luo
Format: Article
Language:English
Published: Wiley 2025-06-01
Series:CAAI Transactions on Intelligence Technology
Subjects:
Online Access:https://doi.org/10.1049/cit2.12405
Description
Summary:Abstract Internal learning-based video inpainting methods have shown promising results by exploiting the intrinsic properties of a video to fill in missing regions without external dataset supervision. However, existing internal learning-based methods often produce inconsistent structures or blurry textures because they make insufficient use of the motion priors within the video sequence. In this paper, the authors propose a new internal learning-based video inpainting model, the appearance consistency and motion coherence network (ACMC-Net), which not only learns the appearance recurrence prior but also captures the motion coherence prior to improve the quality of the inpainting results. In ACMC-Net, a transformer-based appearance network is developed to capture global context within each video frame and thus represent appearance consistency accurately. In addition, a novel motion coherence learning scheme is proposed to learn the motion prior in a video sequence effectively. Finally, the learnt internal appearance consistency and motion coherence are implicitly propagated into the missing regions to complete them convincingly. Extensive experiments on the DAVIS dataset show that the proposed model achieves superior performance on quantitative measurements and produces more visually plausible results than state-of-the-art methods.
ISSN:2468-2322
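
To make the internal learning setting described in the summary more concrete, the following is a minimal, illustrative sketch (not the authors' ACMC-Net implementation) of test-time training on a single masked video: a small network is optimised only on the known pixels of each frame, with a crude temporal-difference penalty inside the hole standing in for the paper's motion coherence learning scheme. All names, layer sizes, and loss weights (TinyInpaintNet, lambda_motion, and so on) are illustrative assumptions.

# Minimal sketch of internal (test-time) learning for video inpainting.
# NOT the ACMC-Net implementation: the appearance and motion terms here are
# deliberately simplified stand-ins chosen for brevity.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyInpaintNet(nn.Module):
    """Toy encoder-decoder standing in for an appearance network.
    Assumes frame height and width are divisible by 4."""
    def __init__(self, ch=32):
        super().__init__()
        self.enc = nn.Sequential(
            nn.Conv2d(4, ch, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(ch, ch * 2, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.dec = nn.Sequential(
            nn.ConvTranspose2d(ch * 2, ch, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(ch, 3, 4, stride=2, padding=1), nn.Sigmoid(),
        )

    def forward(self, frame, mask):
        # Concatenate the masked frame with its mask, a common inpainting input.
        x = torch.cat([frame * mask, mask], dim=1)
        return self.dec(self.enc(x))

def internal_train(frames, masks, steps=2000, lr=1e-3, lambda_motion=0.1):
    """frames: (T, 3, H, W) in [0, 1]; masks: (T, 1, H, W), 1 = known pixel."""
    net = TinyInpaintNet()
    opt = torch.optim.Adam(net.parameters(), lr=lr)
    for _ in range(steps):
        # Sample a pair of consecutive frames from the single input video.
        t = torch.randint(0, frames.shape[0] - 1, (1,)).item()
        f0, f1 = frames[t:t + 1], frames[t + 1:t + 2]
        m0, m1 = masks[t:t + 1], masks[t + 1:t + 2]
        out0, out1 = net(f0, m0), net(f1, m1)
        # Appearance term: reconstruct only the known (unmasked) pixels.
        loss_app = F.l1_loss(out0 * m0, f0 * m0) + F.l1_loss(out1 * m1, f1 * m1)
        # Crude motion-coherence proxy: penalise temporal change inside the
        # hole (a real method would warp one frame with optical flow instead).
        hole = (1 - m0) * (1 - m1)
        loss_motion = F.l1_loss(out0 * hole, out1 * hole)
        loss = loss_app + lambda_motion * loss_motion
        opt.zero_grad()
        loss.backward()
        opt.step()
    return net

In the paper itself the appearance network is transformer-based and the motion prior is learnt with a dedicated coherence scheme propagated implicitly into the missing regions; both are replaced above by the simplest possible substitutes so the overall internal-learning loop stays under a screenful.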