Mitigating Algorithmic Bias Through Probability Calibration: A Case Study on Lead Generation Data
Probability calibration is commonly used to enhance the reliability and interpretability of probabilistic classifiers, yet its potential for reducing algorithmic bias remains under-explored. This study investigates the role of probability calibration techniques in mitigating bias associated with sensitive attributes, specifically country of origin, in binary classification models. Using a real-world lead-generation dataset (a 2853 × 8 matrix) characterized by substantial class imbalance, with the positive class representing 1.4% of observations, several binary classification models were evaluated and the best-performing model was selected as the baseline for further analysis. The evaluated models included Binary Logistic Regression with polynomial degrees of 1, 2, 3, and 4, Random Forest, and XGBoost. Three widely used calibration methods, Platt scaling, isotonic regression, and temperature scaling, were then applied to assess their impact on both the probabilistic accuracy and the fairness metrics of the best-performing model. The findings suggest that post hoc calibration can effectively reduce the influence of sensitive features on predictions, improving fairness without compromising overall classification performance. This study demonstrates the practical value of incorporating calibration as a straightforward and effective fairness intervention within machine learning workflows.
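The core intervention the abstract describes, post hoc calibration of an already-fitted classifier, can be sketched with scikit-learn, which provides Platt scaling and isotonic regression out of the box. This is an illustrative sketch only, not the authors' pipeline: the synthetic data merely mimics the stated 2853 × 8 shape and ~1.4% positive rate, and a plain logistic regression stands in for whichever baseline model performed best.

```python
# Illustrative sketch of post hoc calibration on an imbalanced binary task.
# Data and base model are synthetic stand-ins, not the study's actual setup.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.calibration import CalibratedClassifierCV

rng = np.random.default_rng(0)
X = rng.normal(size=(2853, 8))               # mimic the 2853 x 8 feature matrix
y = (rng.random(2853) < 0.014).astype(int)   # ~1.4% positive class

X_train, X_cal, y_train, y_cal = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

base = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# Platt scaling (method="sigmoid") and isotonic regression are two of the
# three calibration methods named in the abstract. Calibration is fit on a
# held-out split, never on the training data itself. (Recent scikit-learn
# versions prefer FrozenEstimator over cv="prefit".)
platt = CalibratedClassifierCV(base, method="sigmoid", cv="prefit").fit(X_cal, y_cal)
isotonic = CalibratedClassifierCV(base, method="isotonic", cv="prefit").fit(X_cal, y_cal)

print(platt.predict_proba(X_cal)[:3, 1])
print(isotonic.predict_proba(X_cal)[:3, 1])
```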
| Main Authors: | Miroslav Nikolić, Danilo Nikolić, Miroslav Stefanović, Sara Koprivica, Darko Stefanović |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2025-07-01 |
| Series: | Mathematics |
| Subjects: | probability calibration; algorithmic fairness; isotonic regression; expected calibration error; machine learning fairness; binary classification |
| Online Access: | https://www.mdpi.com/2227-7390/13/13/2183 |
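Temperature scaling, the third method the abstract names, has no built-in scikit-learn implementation, so a small hand-rolled version is sketched below. It is an assumed implementation, not the paper's: a single scalar T > 0 divides the model's logits before the sigmoid, and T is chosen to minimize negative log-likelihood on held-out data.

```python
# Assumed temperature-scaling implementation for a binary classifier.
import numpy as np
from scipy.optimize import minimize_scalar

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_temperature(cal_logits, y_cal):
    """Return the scalar T minimizing binary cross-entropy of sigmoid(logits / T)."""
    def nll(t):
        p = np.clip(sigmoid(cal_logits / t), 1e-12, 1 - 1e-12)
        return -np.mean(y_cal * np.log(p) + (1 - y_cal) * np.log(1 - p))
    # Bounded scalar search; T close to 1 means the model was already calibrated.
    return minimize_scalar(nll, bounds=(0.05, 20.0), method="bounded").x

# Hypothetical usage, given held-out logits and labels:
# T = fit_temperature(cal_logits, y_cal)
# calibrated = sigmoid(test_logits / T)
```

Unlike Platt scaling, which fits both a slope and an intercept, temperature scaling only rescales confidence: for T > 0 it cannot change the ranking of predictions or the accuracy at a 0.5 threshold.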
| _version_ | 1839631697572790272 |
|---|---|
| author | Miroslav Nikolić; Danilo Nikolić; Miroslav Stefanović; Sara Koprivica; Darko Stefanović |
| author_facet | Miroslav Nikolić; Danilo Nikolić; Miroslav Stefanović; Sara Koprivica; Darko Stefanović |
| author_sort | Miroslav Nikolić |
| collection | DOAJ |
| description | Probability calibration is commonly used to enhance the reliability and interpretability of probabilistic classifiers, yet its potential for reducing algorithmic bias remains under-explored. This study investigates the role of probability calibration techniques in mitigating bias associated with sensitive attributes, specifically country of origin, in binary classification models. Using a real-world lead-generation dataset (a 2853 × 8 matrix) characterized by substantial class imbalance, with the positive class representing 1.4% of observations, several binary classification models were evaluated and the best-performing model was selected as the baseline for further analysis. The evaluated models included Binary Logistic Regression with polynomial degrees of 1, 2, 3, and 4, Random Forest, and XGBoost. Three widely used calibration methods, Platt scaling, isotonic regression, and temperature scaling, were then applied to assess their impact on both the probabilistic accuracy and the fairness metrics of the best-performing model. The findings suggest that post hoc calibration can effectively reduce the influence of sensitive features on predictions, improving fairness without compromising overall classification performance. This study demonstrates the practical value of incorporating calibration as a straightforward and effective fairness intervention within machine learning workflows. |
| format | Article |
| id | doaj-art-d4082d76c57442299e87b520d752e8b4 |
| institution | Matheson Library |
| issn | 2227-7390 |
| language | English |
| publishDate | 2025-07-01 |
| publisher | MDPI AG |
| record_format | Article |
| series | Mathematics |
| spelling | doaj-art-d4082d76c57442299e87b520d752e8b4; 2025-07-11T14:40:41Z; eng; MDPI AG; Mathematics; 2227-7390; 2025-07-01; vol. 13, no. 13, art. 2183; doi:10.3390/math13132183; Mitigating Algorithmic Bias Through Probability Calibration: A Case Study on Lead Generation Data; Miroslav Nikolić (Open Institute of Technology, University of Malta, XBX 1425 Ta’ Xbiex, Malta); Danilo Nikolić, Miroslav Stefanović, Sara Koprivica, Darko Stefanović (Faculty of Technical Sciences, University of Novi Sad, 21000 Novi Sad, Serbia); Probability calibration is commonly used to enhance the reliability and interpretability of probabilistic classifiers, yet its potential for reducing algorithmic bias remains under-explored. This study investigates the role of probability calibration techniques in mitigating bias associated with sensitive attributes, specifically country of origin, in binary classification models. Using a real-world lead-generation dataset (a 2853 × 8 matrix) characterized by substantial class imbalance, with the positive class representing 1.4% of observations, several binary classification models were evaluated and the best-performing model was selected as the baseline for further analysis. The evaluated models included Binary Logistic Regression with polynomial degrees of 1, 2, 3, and 4, Random Forest, and XGBoost. Three widely used calibration methods, Platt scaling, isotonic regression, and temperature scaling, were then applied to assess their impact on both the probabilistic accuracy and the fairness metrics of the best-performing model. The findings suggest that post hoc calibration can effectively reduce the influence of sensitive features on predictions, improving fairness without compromising overall classification performance. This study demonstrates the practical value of incorporating calibration as a straightforward and effective fairness intervention within machine learning workflows. https://www.mdpi.com/2227-7390/13/13/2183; probability calibration; algorithmic fairness; isotonic regression; expected calibration error; machine learning fairness; binary classification |
| spellingShingle | Miroslav Nikolić; Danilo Nikolić; Miroslav Stefanović; Sara Koprivica; Darko Stefanović; Mitigating Algorithmic Bias Through Probability Calibration: A Case Study on Lead Generation Data; Mathematics; probability calibration; algorithmic fairness; isotonic regression; expected calibration error; machine learning fairness; binary classification |
| title | Mitigating Algorithmic Bias Through Probability Calibration: A Case Study on Lead Generation Data |
| title_full | Mitigating Algorithmic Bias Through Probability Calibration: A Case Study on Lead Generation Data |
| title_fullStr | Mitigating Algorithmic Bias Through Probability Calibration: A Case Study on Lead Generation Data |
| title_full_unstemmed | Mitigating Algorithmic Bias Through Probability Calibration: A Case Study on Lead Generation Data |
| title_short | Mitigating Algorithmic Bias Through Probability Calibration: A Case Study on Lead Generation Data |
| title_sort | mitigating algorithmic bias through probability calibration a case study on lead generation data |
| topic | probability calibration; algorithmic fairness; isotonic regression; expected calibration error; machine learning fairness; binary classification |
| url | https://www.mdpi.com/2227-7390/13/13/2183 |
| work_keys_str_mv | AT miroslavnikolic mitigatingalgorithmicbiasthroughprobabilitycalibrationacasestudyonleadgenerationdata AT danilonikolic mitigatingalgorithmicbiasthroughprobabilitycalibrationacasestudyonleadgenerationdata AT miroslavstefanovic mitigatingalgorithmicbiasthroughprobabilitycalibrationacasestudyonleadgenerationdata AT sarakoprivica mitigatingalgorithmicbiasthroughprobabilitycalibrationacasestudyonleadgenerationdata AT darkostefanovic mitigatingalgorithmicbiasthroughprobabilitycalibrationacasestudyonleadgenerationdata |
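One of the record's subject terms, expected calibration error (ECE), is the usual summary statistic for judging whether calibration helped. A minimal binned version is sketched below; the equal-width ten-bin scheme and edge handling are assumptions, since the article's exact protocol is only available at the link above.

```python
import numpy as np

def expected_calibration_error(y_true, y_prob, n_bins=10):
    """Binned ECE: bin-weighted mean gap between predicted probability
    and observed positive rate. Equal-width bins over [0, 1]."""
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (y_prob > lo) & (y_prob <= hi)   # first bin drops exact zeros
        if not mask.any():
            continue
        conf = y_prob[mask].mean()   # mean predicted probability in the bin
        freq = y_true[mask].mean()   # empirical positive rate in the bin
        ece += mask.mean() * abs(conf - freq)
    return ece

# Hypothetical usage with the calibrated model from the earlier sketch:
# expected_calibration_error(y_cal, isotonic.predict_proba(X_cal)[:, 1])
```

A perfectly calibrated model has an ECE of 0; comparing ECE before and after calibration quantifies the gain in probabilistic accuracy.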