GenConViT: Deepfake Video Detection Using Generative Convolutional Vision Transformer

Bibliographic Details
Main Authors: Deressa Wodajo Deressa, Hannes Mareen, Peter Lambert, Solomon Atnafu, Zahid Akhtar, Glenn Van Wallendael
Format: Article
Language: English
Published: MDPI AG 2025-06-01
Series:Applied Sciences
Subjects:
Online Access: https://www.mdpi.com/2076-3417/15/12/6622
Description
Summary: Deepfakes have raised significant concerns due to their potential to spread false information and compromise the integrity of digital media. Current deepfake detection models often struggle to generalize across a diverse range of deepfake generation techniques and video content. In this work, we propose a Generative Convolutional Vision Transformer (GenConViT) for deepfake video detection. Our model combines ConvNeXt and Swin Transformer models for feature extraction, and it utilizes an Autoencoder and a Variational Autoencoder to learn from latent data distributions. By learning from both visual artifacts and the latent data distribution, GenConViT achieves improved performance in detecting a wide range of deepfake videos. The model is trained and evaluated on the DFDC, FF++, TM, DeepfakeTIMIT, and Celeb-DF (v2) datasets. The proposed GenConViT model demonstrates strong performance in deepfake video detection, achieving high accuracy across the tested datasets. While our model shows promising results by leveraging visual and latent features, we demonstrate that further work is needed to improve its generalizability when encountering out-of-distribution data. Our model provides an effective solution for identifying a wide range of fake videos while preserving the integrity of digital media.
ISSN:2076-3417
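
To make the architecture described in the abstract more concrete, the following is a minimal PyTorch sketch of a GenConViT-style branch: an autoencoder reconstructs each face frame, and pooled features from ConvNeXt and Swin Transformer backbones (here obtained via the timm library) are concatenated and classified as real or fake. The specific model variants (convnext_tiny, swin_tiny_patch4_window7_224), layer sizes, and input resolution are illustrative assumptions, not the authors' exact configuration.

```python
import torch
import torch.nn as nn
import timm  # assumed dependency; provides ConvNeXt and Swin Transformer backbones


class HybridBackbone(nn.Module):
    """Concatenate pooled features from ConvNeXt and Swin Transformer.

    The 'tiny' variants are an assumption for illustration; the abstract only
    states that ConvNeXt and Swin Transformer features are combined.
    """

    def __init__(self):
        super().__init__()
        # num_classes=0 makes timm return pooled feature vectors instead of logits
        self.convnext = timm.create_model("convnext_tiny", pretrained=False, num_classes=0)
        self.swin = timm.create_model("swin_tiny_patch4_window7_224", pretrained=False, num_classes=0)
        self.out_dim = self.convnext.num_features + self.swin.num_features

    def forward(self, x):
        return torch.cat([self.convnext(x), self.swin(x)], dim=1)


class AEBranch(nn.Module):
    """Autoencoder branch: reconstruct the input frame, then classify hybrid
    features extracted from the reconstruction (real vs. fake)."""

    def __init__(self, num_classes: int = 2):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),   # 224 -> 112
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),  # 112 -> 56
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(32, 16, 4, stride=2, padding=1), nn.ReLU(),    # 56 -> 112
            nn.ConvTranspose2d(16, 3, 4, stride=2, padding=1), nn.Sigmoid(),  # 112 -> 224
        )
        self.backbone = HybridBackbone()
        self.classifier = nn.Linear(self.backbone.out_dim, num_classes)

    def forward(self, x):
        recon = self.decoder(self.encoder(x))
        logits = self.classifier(self.backbone(recon))
        return logits, recon


if __name__ == "__main__":
    model = AEBranch()
    frames = torch.randn(2, 3, 224, 224)  # a batch of face-cropped video frames
    logits, recon = model(frames)
    print(logits.shape, recon.shape)  # torch.Size([2, 2]) torch.Size([2, 3, 224, 224])
```

Per the abstract, the full model also uses a Variational Autoencoder branch; it would follow the same pattern, with the reconstruction term added to the training objective so that the network learns from the latent data distribution as well as from visual artifacts.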