Smoothed per-tensor weight quantization: a robust solution for neural network deployment
Main Author:
Format: Article
Language: English
Published: Polish Academy of Sciences, 2025-07-01
Series: International Journal of Electronics and Telecommunications
Subjects:
Online Access: https://journals.pan.pl/Content/135755/23_4966_Chang_L_sk.pdf
Summary: This paper introduces a novel method to improve quantization outcomes for per-tensor weight quantization, focusing on enhancing computational efficiency and compatibility with resource-constrained hardware. Addressing the inherent challenges of depth-wise convolutions, the proposed smooth quantization technique redistributes weight magnitude disparities to the pre-activation data, thereby equalizing channel-wise weight magnitudes. This adjustment enables more effective application of uniform quantization schemes. Experimental evaluations on the ImageNet classification benchmark demonstrate substantial performance gains across modern architectures and training strategies. The proposed method improves the accuracy of per-tensor quantization without noticeable computational overhead, making it a practical solution for edge-device deployment.
ISSN: 2081-8491, 2300-1933
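The smoothing idea described in the summary can be sketched in a few lines. The snippet below is a minimal illustration, not the paper's exact formulation: it assumes a depth-wise layer, equalizes per-channel weight maxima toward their geometric mean, and folds the inverse scale into the pre-activation data so the layer's output is unchanged. The function names (`smooth_depthwise`, `quantize_per_tensor`) and the choice of equalization target are assumptions made for this sketch.

```python
import numpy as np

def smooth_depthwise(w, x, eps=1e-12):
    """Migrate channel-wise weight magnitude disparities into the input.

    w: (C, kh, kw) depth-wise weights; x: (C, H, W) pre-activation input.
    Returns equalized weights, compensated input, and the per-channel scale s,
    so that channel-wise products x_eq * w_eq == x * w (output preserved).
    """
    per_ch = np.abs(w).reshape(w.shape[0], -1).max(axis=1)  # per-channel max |w|
    target = np.exp(np.log(per_ch + eps).mean())            # geometric-mean target
    s = per_ch / target                                     # smoothing scale per channel
    w_eq = w / s[:, None, None]                             # all channels now peak at `target`
    x_eq = x * s[:, None, None]                             # compensate in pre-activation data
    return w_eq, x_eq, s

def quantize_per_tensor(w, n_bits=8):
    """Symmetric uniform per-tensor fake-quantization (quantize + dequantize)."""
    qmax = 2 ** (n_bits - 1) - 1
    scale = np.abs(w).max() / qmax                          # one scale for the whole tensor
    return np.round(w / scale).clip(-qmax, qmax) * scale

# Toy depth-wise layer whose channels span four orders of magnitude.
rng = np.random.default_rng(0)
w = rng.normal(size=(4, 3, 3)) * np.array([0.1, 1.0, 10.0, 100.0])[:, None, None]
x = rng.normal(size=(4, 8, 8))

w_eq, x_eq, s = smooth_depthwise(w, x)
err_orig = np.abs(quantize_per_tensor(w) - w).max()
err_eq = np.abs(quantize_per_tensor(w_eq) - w_eq).max()
```

After smoothing, every channel shares the same peak magnitude, so a single per-tensor scale no longer crushes the small-magnitude channels; on this toy example `err_eq` is far below `err_orig`.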