On the Synergy of Optimizers and Activation Functions: A CNN Benchmarking Study
Main Authors:
Format: Article
Language: English
Published: MDPI AG, 2025-06-01
Series: Mathematics
Subjects:
Online Access: https://www.mdpi.com/2227-7390/13/13/2088
Summary: In this study, we present a comparative analysis of gradient descent-based optimizers frequently used in Convolutional Neural Networks (CNNs): SGD, mSGD, RMSprop, Adadelta, Nadam, Adamax, Adam, and the recent EVE optimizer. To explore the interaction between optimization strategies and activation functions, we systematically evaluate every combination of these optimizers with four activation functions (ReLU, LeakyReLU, Tanh, and GELU) across three benchmark image classification datasets: CIFAR-10, Fashion-MNIST (F-MNIST), and Labeled Faces in the Wild (LFW). Each configuration is assessed on multiple evaluation metrics, including accuracy, precision, recall, F1-score, mean absolute error (MAE), and mean squared error (MSE). All experiments use k-fold cross-validation to ensure statistical robustness, and a two-way ANOVA validates the significance of differences across optimizer–activation combinations. The study highlights the importance of jointly selecting optimizers and activation functions to improve training dynamics and generalization in CNNs, and also considers how critical hyperparameters, such as the learning rate and regularization methods, influence optimization stability. This work provides insight into the optimizer–activation interplay and practical guidance for improving architectural and hyperparameter configurations in CNN-based deep learning models.
ISSN: 2227-7390
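
To make the experimental protocol described in the summary concrete, below is a minimal sketch, not the authors' code, of how an optimizer–activation grid could be scored with stratified k-fold cross-validation and then tested with a two-way ANOVA. It assumes TensorFlow/Keras, scikit-learn, pandas, and statsmodels; the `build_cnn` architecture, learning rates, epoch budget, and data subset size are illustrative choices, and EVE is omitted because Keras does not ship an implementation of it.

```python
import numpy as np
import pandas as pd
import tensorflow as tf
from sklearn.model_selection import StratifiedKFold
import statsmodels.api as sm
from statsmodels.formula.api import ols

# Optimizer factories; a fresh instance is created per fold. EVE is omitted
# because Keras does not bundle it, so it would need a custom optimizer class.
OPTIMIZERS = {
    "SGD":      lambda: tf.keras.optimizers.SGD(learning_rate=1e-2),
    "mSGD":     lambda: tf.keras.optimizers.SGD(learning_rate=1e-2, momentum=0.9),
    "RMSprop":  lambda: tf.keras.optimizers.RMSprop(),
    "Adadelta": lambda: tf.keras.optimizers.Adadelta(),
    "Nadam":    lambda: tf.keras.optimizers.Nadam(),
    "Adamax":   lambda: tf.keras.optimizers.Adamax(),
    "Adam":     lambda: tf.keras.optimizers.Adam(),
}

# Activations under test; LeakyReLU is passed as a callable since it lacks
# a plain string alias in older Keras releases.
ACTIVATIONS = {
    "ReLU": "relu",
    "LeakyReLU": tf.nn.leaky_relu,
    "Tanh": "tanh",
    "GELU": "gelu",
}

def build_cnn(activation, num_classes, input_shape):
    """A small illustrative CNN; the paper's architecture is not reproduced here."""
    return tf.keras.Sequential([
        tf.keras.layers.Input(shape=input_shape),
        tf.keras.layers.Conv2D(32, 3, activation=activation),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Conv2D(64, 3, activation=activation),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(128, activation=activation),
        tf.keras.layers.Dense(num_classes, activation="softmax"),
    ])

def run_grid(x, y, n_splits=5, epochs=5):
    """Score every optimizer-activation pair with stratified k-fold CV."""
    skf = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=0)
    rows = []
    for opt_name, make_opt in OPTIMIZERS.items():
        for act_name, act in ACTIVATIONS.items():
            for fold, (tr, va) in enumerate(skf.split(x, y)):
                model = build_cnn(act, len(np.unique(y)), x.shape[1:])
                model.compile(optimizer=make_opt(),
                              loss="sparse_categorical_crossentropy",
                              metrics=["accuracy"])
                model.fit(x[tr], y[tr], epochs=epochs, batch_size=128, verbose=0)
                _, acc = model.evaluate(x[va], y[va], verbose=0)
                rows.append({"optimizer": opt_name, "activation": act_name,
                             "fold": fold, "accuracy": acc})
    return pd.DataFrame(rows)

# Example on Fashion-MNIST; CIFAR-10 and LFW would follow the same pattern.
# A small subset keeps this sketch fast to run.
(x, y), _ = tf.keras.datasets.fashion_mnist.load_data()
x = x[..., None].astype("float32") / 255.0
x, y = x[:2000], y[:2000]
results = run_grid(x, y)

# Two-way ANOVA on per-fold accuracies (type II sums of squares).
fit = ols("accuracy ~ C(optimizer) * C(activation)", data=results).fit()
print(sm.stats.anova_lm(fit, typ=2))
```

The formula `accuracy ~ C(optimizer) * C(activation)` expands to both main effects plus their interaction term; the interaction is what a two-way ANOVA uses to test whether the best optimizer depends on which activation function it is paired with, which is the central question the study's design targets.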