EarlyExodus: Leveraging early exits to mitigate backdoor vulnerability in deep learning

Bibliographic Details
Main Authors: Salmane Douch, M. Riduan Abid, Khalid Zine-Dine, Driss Bouzidi, Fatima Ezzahra El Aidos, Driss Benhaddou
Format: Article
Language:English
Published: Elsevier 2025-09-01
Series:Results in Engineering
Online Access:http://www.sciencedirect.com/science/article/pii/S2590123025024740
Description
Summary:The rapid migration of artificial-intelligence workloads toward edge computing significantly enhances capabilities in critical applications such as autonomous vehicles, augmented and virtual reality, and e-health, but it also heightens the urgency for robust security. This urgency reveals a critical gap: state-of-the-art backdoor defenses remain vulnerable to sophisticated data-poisoning attacks that subtly embed malicious triggers into training data and covertly manipulate model predictions, threatening the reliability and trustworthiness of edge-deployed AI. To counter this threat, we propose a defense mechanism that neutralizes advanced data-poisoning attacks, clearly identifies maliciously targeted labels, and preserves model accuracy and integrity across diverse architectures and datasets. Our technique, EarlyExodus, integrates early-exit branches within neural networks and trains them with a divergence objective so that, for poisoned inputs, the early exit exposes the malicious label while the final exit maintains the correct classification. Extensive experiments with LeNet-5, ResNet-32, and GhostNet on MNIST, CIFAR-10, and GTSRB show that EarlyExodus reduces the average attack success rate of seven recent backdoor attacks to about 3%, while keeping clean-data accuracy degradation below 2%. These results demonstrate a practical, architecture-agnostic pathway toward trustworthy edge-AI systems and lay the foundation for extending backdoor defenses beyond image models to broader application domains.
ISSN:2590-1230
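
Illustrative note: the abstract only outlines the mechanism, so the following is a minimal PyTorch sketch of the general idea it describes, a network with an auxiliary early-exit head whose training loss combines standard cross-entropy at the final exit with a divergence term that pushes the early exit's prediction away from the final exit's on inputs flagged as potentially poisoned. The class EarlyExitCNN, the function divergence_loss, the poison_mask input, and the alpha weight are hypothetical names introduced here for illustration; the paper's actual architecture placement, divergence objective, and poison-identification procedure are not specified in this record.

import torch
import torch.nn as nn
import torch.nn.functional as F


class EarlyExitCNN(nn.Module):
    # Backbone with one auxiliary (early) classifier and a final classifier.
    def __init__(self, num_classes=10):
        super().__init__()
        self.stem = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2))
        # Early-exit head attached right after the stem.
        self.early_head = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, num_classes))
        self.trunk = nn.Sequential(
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2))
        self.final_head = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, num_classes))

    def forward(self, x):
        h = self.stem(x)
        early_logits = self.early_head(h)               # early exit
        final_logits = self.final_head(self.trunk(h))   # final exit
        return early_logits, final_logits


def divergence_loss(early_logits, final_logits, targets, poison_mask, alpha=1.0):
    # Hypothetical objective: keep the final exit accurate while maximizing
    # the KL divergence between the two exits on flagged (suspect) samples.
    ce_final = F.cross_entropy(final_logits, targets)
    kl = F.kl_div(F.log_softmax(early_logits, dim=1),
                  F.softmax(final_logits, dim=1),
                  reduction="none").sum(dim=1)          # per-sample KL
    div_term = -(kl * poison_mask.float()).mean()       # negate to maximize
    return ce_final + alpha * div_term


# Toy usage on random data.
model = EarlyExitCNN(num_classes=10)
x = torch.randn(8, 3, 32, 32)
y = torch.randint(0, 10, (8,))
mask = torch.zeros(8, dtype=torch.bool)
mask[:2] = True                                         # pretend two samples are suspect
early, final = model(x)
loss = divergence_loss(early, final, y, mask)
loss.backward()

At inference time, a large disagreement between the two exits (or an early-exit prediction matching a suspected attack target label) could flag an input as poisoned; this inference rule is likewise an assumption for illustration, not the paper's stated procedure.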