Undergraduate Thesis (Skripsi) CD
Evaluation of the Adaptive Synthetic Sampling (ADASYN) Technique for Classifying Poor Households in Riau Province Using Logistic Regression
Challenges in classification modeling often arise from imbalanced data distributions. Imbalance occurs when one class has far more observations than the other, which biases the classifier toward the majority class. One way to address this is resampling, here by applying adaptive synthetic sampling (ADASYN). To evaluate the effectiveness of ADASYN, this study fits a logistic regression classifier to 2024 household data for Riau Province, sourced from the statistical service information system of the Central Bureau of Statistics (Badan Pusat Statistik). Households are divided into poor and non-poor categories based on the poverty line. The logistic regression model without resampling achieves 91% accuracy, 100% specificity, 0% recall, and 0% g-mean; that is, it classifies every household as non-poor. The logistic regression model with ADASYN resampling achieves 70% accuracy, 73% specificity, 45% recall, and 57% g-mean. The increase in recall and g-mean indicates that ADASYN resampling is effective in improving the balance of classification on imbalanced data. This improvement comes at the cost of accuracy, which falls from 91% to 70% (a decrease of 21 percentage points).
Keywords: Adaptive synthetic sampling (ADASYN), imbalanced data, logistic regression, poor household
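The relationship between the metrics reported in the abstract can be checked with a short calculation. The following is a minimal sketch, not the thesis's own code: it assumes the g-mean is the geometric mean of recall and specificity (the usual definition), that poor households are the positive class, and that poor households make up about 9% of the evaluation data. That share is an inference from the baseline figures (91% accuracy with 100% specificity and 0% recall), not a number stated in the record. The function names and `POOR_SHARE` are hypothetical.

```python
import math


def g_mean(recall: float, specificity: float) -> float:
    """Geometric mean of recall (sensitivity) and specificity."""
    return math.sqrt(recall * specificity)


def accuracy_from_rates(recall: float, specificity: float,
                        positive_share: float) -> float:
    """Overall accuracy implied by per-class rates and the positive-class share."""
    return positive_share * recall + (1.0 - positive_share) * specificity


# Rates reported in the abstract (poor household = positive class).
reported = {
    "no resampling": {"recall": 0.00, "specificity": 1.00},
    "ADASYN": {"recall": 0.45, "specificity": 0.73},
}

# Assumption: share of poor households in the evaluation data, inferred from
# the baseline model (91% accuracy while predicting "non-poor" for everyone).
POOR_SHARE = 0.09

for name, rates in reported.items():
    gm = g_mean(rates["recall"], rates["specificity"])
    acc = accuracy_from_rates(rates["recall"], rates["specificity"], POOR_SHARE)
    print(f"{name}: g-mean = {gm:.2f}, implied accuracy = {acc:.2f}")
```

Under these assumptions the g-mean is zero whenever either class is never predicted correctly, which is why the baseline model scores 0% despite its high accuracy, and the implied accuracy values agree with the reported 91% and 70%.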