Pupil Detection Algorithm Based on ViM
Main Authors: | , , , |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2025-06-01 |
Series: | Sensors |
Subjects: | |
Online Access: | https://www.mdpi.com/1424-8220/25/13/3978 |
Summary: | Pupil detection is a key technology in fields such as human–computer interaction, driver fatigue detection, and medical diagnosis. Existing pupil detection algorithms still struggle to remain robust under variable lighting and occlusion. In this paper, we propose a novel pupil detection algorithm, ViMSA, based on the ViM model. The algorithm makes three contributions. First, it introduces weighted feature fusion so that the model adaptively learns how much each feature patch contributes to the detection result. Second, it combines ViM with the MSA (multi-head self-attention) mechanism to integrate global features and improve the accuracy and robustness of pupil detection. Third, it uses the FFT (Fast Fourier Transform) to convert the time-domain vector outer product in MSA into a frequency-domain dot product, reducing the model's computational complexity and improving its detection efficiency. ViMSA was trained and tested on nearly 135,000 pupil images from 30 different datasets, demonstrating exceptional generalization capability. The experimental results show that ViMSA achieves 99.6% detection accuracy at five pixels with an RMSE of 1.67 pixels and a processing speed exceeding 100 FPS. This meets real-time monitoring requirements for applications including operation under variable and uneven lighting conditions, assistive technology (enabling communication with patients with neuro-motor disorders through pupil recognition), computer gaming, and automotive applications (enhancing traffic safety by monitoring drivers' cognitive states). |
---|---|
ISSN: | 1424-8220 |
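
The summary reports detection accuracy at five pixels and an RMSE in pixels. As a rough illustration of how such metrics are commonly computed, the sketch below assumes that the error is the Euclidean distance between the predicted and ground-truth pupil centers, that a detection counts as correct when this distance is at most the threshold, and that RMSE is taken over those distances. These definitions, the function names, and the coordinates are assumptions for illustration; the record does not specify the paper's exact evaluation protocol.

```python
import math


def pixel_errors(pred, truth):
    """Euclidean distance (pixels) between predicted and true pupil centers."""
    if len(pred) != len(truth):
        raise ValueError("pred and truth must have the same length")
    return [math.hypot(px - tx, py - ty)
            for (px, py), (tx, ty) in zip(pred, truth)]


def detection_rate(errors, threshold_px=5.0):
    """Fraction of samples whose error is within threshold_px (inclusive; an assumption)."""
    return sum(e <= threshold_px for e in errors) / len(errors)


def rmse(errors):
    """Root-mean-square of the per-sample pixel errors."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))


# Made-up coordinates for illustration only; not data from the paper.
predicted = [(100.0, 120.0), (64.0, 80.0), (30.0, 40.0), (10.0, 10.0)]
ground_truth = [(103.0, 124.0), (64.0, 80.0), (30.0, 46.0), (10.0, 10.0)]

errors = pixel_errors(predicted, ground_truth)
print(f"detection rate @ 5 px: {detection_rate(errors):.2f}")
print(f"RMSE: {rmse(errors):.2f} px")
```

Whether the threshold comparison is inclusive, and whether RMSE is computed per coordinate or over the distance, varies between evaluation protocols, so the choices here should be checked against the article itself.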