Mitigating Algorithmic Bias Through Probability Calibration: A Case Study on Lead Generation Data

Probability calibration is commonly used to enhance the reliability and interpretability of probabilistic classifiers, yet its potential for reducing algorithmic bias remains under-explored. In this study, the role of probability calibration techniques in mitigating bias associated with sensitive attributes, specifically country of origin, within binary classification models is investigated. Using a real-world lead-generation dataset (a 2853 × 8 matrix) characterized by substantial class imbalance, with the positive class representing 1.4% of observations, several binary classification models were evaluated and the best-performing model was selected as the baseline for further analysis. The evaluated models included Binary Logistic Regression with polynomial degrees of 1, 2, 3, and 4, Random Forest, and XGBoost. Three widely used calibration methods (Platt scaling, isotonic regression, and temperature scaling) were then applied to the best-performing model to assess their impact on both probabilistic accuracy and fairness metrics. The findings suggest that post hoc calibration can effectively reduce the influence of sensitive features on predictions, improving fairness without compromising overall classification performance. This study demonstrates the practical value of incorporating calibration as a straightforward and effective fairness intervention within machine learning workflows.

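The post hoc calibration step described in the abstract can be sketched in a few lines with standard tooling. The snippet below is a minimal illustration assuming a scikit-learn/SciPy workflow; the synthetic data, the plain logistic regression baseline, and all parameter choices are placeholders standing in for, not taken from, the paper's actual dataset and selected model:

# Minimal sketch of post hoc probability calibration (Platt scaling,
# isotonic regression, temperature scaling). Synthetic data and an untuned
# logistic regression stand in for the paper's dataset and chosen baseline.
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.special import expit
from sklearn.calibration import CalibratedClassifierCV
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Placeholder data mimicking the reported shape: 2853 x 8, ~1.4% positives.
rng = np.random.default_rng(0)
X = rng.normal(size=(2853, 8))
y = (rng.random(2853) < 0.014).astype(int)

X_train, X_cal, y_train, y_cal = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Baseline probabilistic classifier (the paper selects the best of several
# candidates; a single logistic regression is used here for brevity).
base = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# Platt scaling ("sigmoid") and isotonic regression via cross-validated
# wrappers that refit clones of the base model internally.
platt = CalibratedClassifierCV(
    LogisticRegression(max_iter=1000), method="sigmoid", cv=5
).fit(X_train, y_train)
isotonic = CalibratedClassifierCV(
    LogisticRegression(max_iter=1000), method="isotonic", cv=5
).fit(X_train, y_train)

# Temperature scaling: divide the baseline's logits by a scalar T fitted by
# minimizing negative log-likelihood on a held-out calibration split.
def fit_temperature(logits, labels):
    def nll(T):
        p = np.clip(expit(logits / T), 1e-12, 1 - 1e-12)
        return -np.mean(labels * np.log(p) + (1 - labels) * np.log(1 - p))
    return minimize_scalar(nll, bounds=(0.05, 10.0), method="bounded").x

T = fit_temperature(base.decision_function(X_cal), y_cal)

# Calibrated probabilities from each method, for side-by-side comparison.
probs = {
    "uncalibrated": base.predict_proba(X_cal)[:, 1],
    "platt": platt.predict_proba(X_cal)[:, 1],
    "isotonic": isotonic.predict_proba(X_cal)[:, 1],
    "temperature": expit(base.decision_function(X_cal) / T),
}
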
Bibliographic Details
Main Authors: Miroslav Nikolić, Danilo Nikolić, Miroslav Stefanović, Sara Koprivica, Darko Stefanović
Format: Article
Language: English
Published: MDPI AG, 2025-07-01
Series: Mathematics, Vol. 13, Iss. 13, Article 2183
DOI: 10.3390/math13132183
ISSN: 2227-7390
Author Affiliations: Open Institute of Technology, University of Malta, XBX 1425 Ta’ Xbiex, Malta (Miroslav Nikolić); Faculty of Technical Sciences, University of Novi Sad, 21000 Novi Sad, Serbia (Danilo Nikolić, Miroslav Stefanović, Sara Koprivica, Darko Stefanović)
Subjects: probability calibration; algorithmic fairness; isotonic regression; expected calibration error; machine learning fairness; binary classification
Online Access: https://www.mdpi.com/2227-7390/13/13/2183
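
The subject terms above include expected calibration error and machine learning fairness. As a companion to the calibration sketch earlier, the following illustrative snippet shows one common way to compute ECE and a simple group fairness gap for a binary sensitive attribute; the binning scheme and fairness metric are standard textbook choices, not necessarily the exact definitions used in the paper:

# Illustrative expected calibration error (ECE) and demographic parity gap.
# Definitions follow common practice and may differ from the paper's.
import numpy as np

def expected_calibration_error(y_true, y_prob, n_bins=10):
    # Weighted average of |mean predicted probability - observed positive
    # rate| over equal-width probability bins.
    bin_ids = np.minimum((y_prob * n_bins).astype(int), n_bins - 1)
    ece = 0.0
    for b in range(n_bins):
        mask = bin_ids == b
        if mask.any():
            ece += mask.mean() * abs(y_prob[mask].mean() - y_true[mask].mean())
    return ece

def demographic_parity_gap(y_prob, sensitive, threshold=0.5):
    # Largest difference in positive prediction rates across the groups of a
    # sensitive attribute (e.g., country of origin).
    preds = (y_prob >= threshold).astype(int)
    rates = [preds[sensitive == g].mean() for g in np.unique(sensitive)]
    return max(rates) - min(rates)

# Toy usage with synthetic scores standing in for calibrated probabilities.
rng = np.random.default_rng(1)
y_true = (rng.random(1000) < 0.05).astype(int)
y_prob = rng.random(1000)               # stand-in for calibrated scores
sensitive = rng.integers(0, 2, 1000)    # stand-in for the encoded attribute
print(expected_calibration_error(y_true, y_prob))
print(demographic_parity_gap(y_prob, sensitive))

Lower values are better for both quantities; the abstract's claim is that post hoc calibration reduces the influence of the sensitive attribute (improving such fairness measures) without degrading overall classification performance.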