The paper introduces a probability-based cluster expansion oversampling technique for class imbalance in data mining, where one class is underrepresented in the training data. The method addresses between-class and within-class imbalance at the same time, using model-based clustering and the computation of separating hyperplanes. It is validated on 10 public datasets, where it is reported to outperform existing oversampling methods by a statistically significant margin, improving the generalization ability of classifiers.
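The general idea of expanding minority clusters can be illustrated with a minimal sketch. It is not the paper's algorithm, and several things are assumed here rather than taken from the source: the minority class has already been split into clusters (the model-based clustering step is skipped), a separating hyperplane `w·x + b = 0` is supplied with the minority side at `w·x + b > 0`, each cluster is modeled as a diagonal Gaussian, and every cluster is grown toward an equal share of the majority count. The function names `expand_cluster` and `oversample` are hypothetical.

```python
import math
import random
import statistics


def expand_cluster(points, n_new, w, b, rng, max_tries=10_000):
    """Draw up to n_new synthetic points for one minority cluster.

    Assumption: the cluster is modeled as a diagonal Gaussian, and only
    draws on the minority side of the hyperplane (w.x + b > 0) are kept.
    """
    dims = list(zip(*points))  # transpose: one tuple per feature
    means = [statistics.fmean(d) for d in dims]
    stds = [statistics.pstdev(d) for d in dims]

    synthetic = []
    tries = 0
    while len(synthetic) < n_new and tries < max_tries:
        tries += 1
        candidate = [rng.gauss(m, s) for m, s in zip(means, stds)]
        # Reject candidates that cross into the majority region.
        if sum(wi * ci for wi, ci in zip(w, candidate)) + b > 0:
            synthetic.append(candidate)
    return synthetic


def oversample(clusters, n_majority, w, b, seed=0):
    """Grow each minority cluster toward an equal share of n_majority.

    Assumption: equalizing cluster sizes stands in for the within-class
    correction, and matching n_majority overall stands in for the
    between-class correction.
    """
    rng = random.Random(seed)
    target = math.ceil(n_majority / len(clusters))
    synthetic = []
    for cluster in clusters:
        n_new = max(0, target - len(cluster))
        synthetic.extend(expand_cluster(cluster, n_new, w, b, rng))
    return synthetic


if __name__ == "__main__":
    minority_clusters = [
        [[1.0, 1.0], [1.2, 0.9], [0.9, 1.1], [1.1, 1.0]],  # larger cluster
        [[3.0, 3.0], [3.1, 2.9]],                          # smaller cluster
    ]
    new_points = oversample(minority_clusters, n_majority=10, w=[1.0, 1.0], b=0.0)
    print(len(new_points), "synthetic minority points generated")
```

In this sketch the smaller cluster receives more synthetic points than the larger one, which is one simple way to reduce within-class imbalance while the total minority count approaches the majority count.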