Abstract
Fairness issues in medical AI can lead to uneven diagnostic
performance across demographic groups. In MRI-based dementia
detection, skewed class and gender distributions often degrade equity.
Using an MRI dataset of 150 adults (60–96 years) labelled as
demented, non-demented or converted, we trained Logistic
Regression, SVM, Random Forest and Gradient Boosting models
under three data regimes: Original (imbalanced), SMOTE and
Borderline-SMOTE. We standardised features, performed stratified
70/30 train–test splits and tuned models with grid search and five-fold
cross-validation. Performance was evaluated via Accuracy, Precision,
Recall, F1-score and MCC, while fairness was assessed by reporting
metrics separately for male and female subgroups under an equal
opportunity notion focused on subgroup F1-scores. Performance
improved monotonically from Original to SMOTE to Borderline-
SMOTE. Ensemble methods showed the strongest gains after
Borderline-SMOTE (MCC > 0.85). The male–female F1-score gap
narrowed from about 7 percent (Original) to about 4 percent (SMOTE)
and to under 3 percent (Borderline-SMOTE), indicating progressive
bias mitigation while maintaining accuracy. Borderline-SMOTE, by
focusing synthesis near decision boundaries, outperformed standard
SMOTE, improving stability and reducing gender disparity. Coupling
targeted resampling with ensemble learning is a practical route to fairer
dementia classifiers without sacrificing predictive power.
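
The subgroup comparison described above (F1-score reported separately for male and female participants, then compared) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it treats the task as binary with a single positive class, whereas the dataset has three labels and the abstract does not say how F1 was averaged across them. The function names `f1_score_binary` and `subgroup_f1_gap`, the group codes "M" and "F", and the toy labels are all hypothetical.

```python
from collections import defaultdict


def f1_score_binary(y_true, y_pred, positive=1):
    """F1 for a single positive class; returns 0.0 when F1 is undefined."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def subgroup_f1_gap(y_true, y_pred, groups, positive=1):
    """Return per-group F1 scores and the largest absolute gap between groups."""
    by_group = defaultdict(lambda: ([], []))
    for t, p, g in zip(y_true, y_pred, groups):
        by_group[g][0].append(t)  # true labels for this subgroup
        by_group[g][1].append(p)  # predictions for this subgroup
    scores = {
        g: f1_score_binary(ts, ps, positive) for g, (ts, ps) in by_group.items()
    }
    gap = max(scores.values()) - min(scores.values())
    return scores, gap


# Toy labels invented purely for illustration (assumption: not the study's data).
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0, 0, 1]
groups = ["M", "M", "M", "M", "F", "F", "F", "F"]

scores, gap = subgroup_f1_gap(y_true, y_pred, groups)
print(scores, gap)
```

Computing the gap on the held-out test split for each data regime (Original, SMOTE, Borderline-SMOTE) would give the kind of comparison the abstract reports. For the three-class setting, a per-class or macro-averaged F1 would need to replace the binary score used here.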
| Original language | English |
|---|---|
| Number of pages | 7 |
| Journal | International Journal of Computer Science and Information Security (IJCSIS) |
| Volume | 23 |
| Issue number | 6 |
| DOIs | |
| Publication status | Published - 30 Dec 2025 |
| Title | MITIGATING GENDER BIAS AND FAIRNESS IN DEMENTIA CLASSIFICATION USING BORDERLINE SMOTE AND ENSEMBLE MACHINE LEARNING MODELS |