Knowledge distillation for spiking neural networks: aligning features and saliency
Main Authors: | , , |
---|---|
Format: | Article |
Language: | English |
Published: | IOP Publishing, 2025-01-01 |
Series: | Neuromorphic Computing and Engineering |
Subjects: | |
Online Access: | https://doi.org/10.1088/2634-4386/ade821 |
Summary: | Spiking neural networks (SNNs) are renowned for their energy efficiency and bio-fidelity, but their widespread adoption is hindered by challenges in training, primarily the non-differentiability of spiking activations and limited representational capacity. Existing approaches, such as artificial neural network (ANN)-to-SNN conversion and surrogate gradient learning, suffer from either prolonged simulation times or suboptimal performance. To address these challenges, we provide a novel perspective that frames knowledge distillation as a hybrid training strategy, effectively combining knowledge transfer from pretrained models with spike-based gradient learning. This approach leverages the complementary benefits of both paradigms, enabling the development of high-performance, low-latency SNNs. Our approach features a lightweight affine projector that facilitates flexible representation alignment across diverse network architectures and neuron types. We further demonstrate empirically that distillation remains effective irrespective of whether high-precision membrane potentials or binary spike trains are used as features. Through a quantitative measure of the consistency between model predictions and the saliency of relevant input pixels, we show that knowledge transfer is grounded in a shared understanding of salient features rather than the exact replication of numerical activations. This framework represents a significant step towards enabling SNNs to achieve accuracy levels competitive with those of their ANN counterparts while maintaining a minimal number of timesteps. For instance, applying our method to ResNet-18 on CIFAR-100 attains 80.48% accuracy with just four timesteps, surpassing the equivalent ANN (79.90%) and yielding a 3.49% improvement over non-distilled SNNs. |
---|---|
ISSN: | 2634-4386 |
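
The abstract describes a hybrid objective: spike-based gradient learning on the task loss, combined with knowledge transfer from a pretrained teacher through a lightweight affine projector that aligns student and teacher features. A minimal PyTorch sketch of that idea follows; the names (AffineProjector, distillation_loss) and the specific loss choices (MSE feature alignment, temperature-scaled KL divergence on logits) are illustrative assumptions, not the paper's exact formulation.

```python
# Illustrative sketch only; module names, loss terms, and weights are
# assumptions, not the paper's exact formulation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AffineProjector(nn.Module):
    """Lightweight affine map (Wx + b) that aligns student SNN features
    with the teacher's feature space, allowing the two networks to differ
    in architecture and neuron type."""
    def __init__(self, student_dim: int, teacher_dim: int):
        super().__init__()
        self.proj = nn.Linear(student_dim, teacher_dim)

    def forward(self, student_feat: torch.Tensor) -> torch.Tensor:
        return self.proj(student_feat)

def distillation_loss(student_logits, teacher_logits,
                      student_feat, teacher_feat, projector,
                      labels, tau=4.0, alpha=0.5, beta=1.0):
    """Hybrid objective: task cross-entropy (optimized with surrogate
    gradients inside the SNN) + soft-label distillation + feature
    alignment through the affine projector."""
    # Task loss: standard cross-entropy on the student's output.
    ce = F.cross_entropy(student_logits, labels)
    # Soft-label distillation on temperature-scaled logits.
    kd = F.kl_div(F.log_softmax(student_logits / tau, dim=1),
                  F.softmax(teacher_logits / tau, dim=1),
                  reduction="batchmean") * tau * tau
    # Feature alignment: per the abstract, student_feat may be either
    # high-precision membrane potentials or binary spike features.
    feat = F.mse_loss(projector(student_feat), teacher_feat)
    return ce + alpha * kd + beta * feat
```

Because the projector sits outside the student network, it can be discarded after training, leaving inference-time cost unchanged.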
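The abstract's saliency-consistency claim can likewise be pictured with a simple input-gradient saliency map and a similarity score between teacher and student maps. The metric below (cosine similarity of flattened gradient-magnitude maps) is one plausible instantiation, not necessarily the measure used in the paper.

```python
# One plausible saliency-consistency measure; the paper's exact metric
# may differ.
import torch.nn.functional as F

def saliency_map(model, x, target):
    """Absolute input gradient of the target-class logit, reduced over
    channels; highlights the pixels the prediction depends on. For an
    SNN student this relies on surrogate gradients for the spike
    function being defined."""
    x = x.clone().requires_grad_(True)
    score = model(x).gather(1, target.unsqueeze(1)).sum()
    score.backward()
    return x.grad.abs().amax(dim=1)          # (B, H, W)

def saliency_consistency(student, teacher, x, target):
    """Cosine similarity between student and teacher saliency maps;
    values near 1 mean both models rely on the same input pixels."""
    s = saliency_map(student, x, target).flatten(1)
    t = saliency_map(teacher, x, target).flatten(1)
    return F.cosine_similarity(s, t, dim=1).mean()
```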