Enhanced Gold Ore Classification: A Comparative Analysis of Machine Learning Techniques with Textural and Chemical Data
Main Authors: | , , |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2025-07-01 |
Series: | Geosciences |
Subjects: | |
Online Access: | https://www.mdpi.com/2076-3263/15/7/248 |
Summary: | Computational methods such as machine learning algorithms can help mining professionals identify and resolve classification problems in mineralized horizons quickly and consistently, and can reveal key variables that affect predictive outcomes, many of which were previously difficult to observe. Integrating the numerical and categorical variables that make up a dataset for defining ore grades is part of the daily routine of the professionals who collect the data and carry out the successive phases of analysis in a mining project. Supervised and unsupervised machine learning methods encompass a wide variety of algorithms aimed at efficient recognition of patterns and similarities and at accurate decision-making. The objective of this study is to classify samples as gold ore or gangue using supervised machine learning methods, with numerical variables represented by grades and categorical variables obtained from drillhole descriptions. Four groups of variables with different variable configurations were selected. The classification algorithms were applied to the different groups of variables to identify the important variables and the impact of each on the classification, and to determine the best algorithm in terms of accuracy and precision. The datasets were subjected to training, validation, and testing using the decision tree, random forest, AdaBoost, XGBoost, and logistic regression methods. The data were randomly divided into training (60%) and testing (40%) sets, with 10-fold cross-validation. XGBoost obtained the best performance, with an accuracy of 0.96 for scenario C1. In the SHAP analysis, the variable As (arsenic) was prominent in the predictions, mainly in scenarios C1 and C3. The arsenic class (Class_As), present mainly in scenario C4, had a significant positive weight in the classification. In the Receiver Operating Characteristic (ROC) analysis, XGBoost in scenario C1 obtained the highest Area Under the Curve (AUC), 0.985, indicating that it performed best in ore/gangue classification of the sample set. Logistic regression and AdaBoost had the worst performance, which also varied between scenarios. |
---|---|
ISSN: | 2076-3263 |
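
The evaluation protocol named in the summary (a random 60%/40% split, 10-fold cross-validation, accuracy, and ROC AUC) can be illustrated with a minimal sketch. This is not the authors' code: it uses only the Python standard library, the function names are hypothetical, and the labels and scores are invented toy values rather than data from the study. Applying the 10 folds to the training portion is also an assumption, since the record does not say how the split and the cross-validation were combined. The classifiers themselves (decision tree, random forest, AdaBoost, XGBoost, logistic regression) are not reproduced here.

```python
import random

def split_60_40(n_samples, seed=0):
    """Shuffle sample indices and split them 60% training / 40% testing."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    cut = int(round(0.6 * n_samples))
    return indices[:cut], indices[cut:]

def k_folds(indices, k=10):
    """Partition indices into k near-equal folds for cross-validation."""
    return [indices[i::k] for i in range(k)]

def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def roc_auc(y_true, scores):
    """AUC as the probability that a random ore sample (label 1) scores
    higher than a random gangue sample (label 0); ties count as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one sample of each class")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy labels and scores, invented for illustration only (1 = ore, 0 = gangue).
labels = [1, 0, 1, 1, 0, 0, 1, 0]
scores = [0.9, 0.2, 0.7, 0.4, 0.5, 0.1, 0.8, 0.3]

train_idx, test_idx = split_60_40(100)   # 100 hypothetical samples
folds = k_folds(train_idx, k=10)         # assumed: folds drawn from the training part
toy_auc = roc_auc(labels, scores)
toy_acc = accuracy(labels, [int(s >= 0.5) for s in scores])  # 0.5 threshold is an assumption
```

The pairwise definition of AUC used here is equivalent to the area under the ROC curve and makes the metric easy to read: a value of 0.985, as reported for XGBoost in scenario C1, would mean that a randomly chosen ore sample receives a higher score than a randomly chosen gangue sample about 98.5% of the time.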