Undergraduate Thesis (CD)
Performance Comparison of the SMOTE and Random Oversampling Methods for Handling Imbalanced Data in the Classification of Household Welfare Status in Pekanbaru, 2020
ABSTRACT Classification analysis can be applied in many settings, one of which is classifying household welfare status into poor and non-poor categories. When the classes are imbalanced, the imbalance must be addressed before classification; otherwise, households that are actually poor tend to be classified as non-poor. This study handles imbalanced data with the Synthetic Minority Oversampling Technique (SMOTE) and Random Oversampling (ROS). The balanced data are then classified using K-Nearest Neighbor (KNN) and binary logistic regression, and the performance of SMOTE and ROS is compared. The study uses 18 variables, covering household welfare status and the factors that influence it, taken from BPS zero-rupiah tariff data drawn from the results of the National Socioeconomic Survey. The analysis shows that ROS performs better than SMOTE in handling imbalanced data for the classification of household welfare status. The accuracy, sensitivity, specificity, G-mean, and AUC were 93.08%, 66.67%, 93.59%, 79%, and 0.801 for KNN, and 90.57%, 66.67%, 91.02%, 78%, and 0.83 for binary logistic regression.
Keywords: Imbalanced Data, SMOTE, ROS, Classification.
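The difference between the two oversampling methods named in the abstract, and the way the G-mean relates to sensitivity and specificity, can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the function names (`random_oversample`, `smote`, `g_mean`), the toy minority rows, and the choice of `k` are all assumptions made for illustration. ROS duplicates existing minority rows, whereas SMOTE interpolates new rows between a minority row and one of its nearest minority neighbours. The G-mean is taken here as the square root of sensitivity times specificity, which reproduces the figures reported in the abstract.

```python
import math
import random


def random_oversample(minority, target_size, rng):
    """ROS: duplicate randomly chosen minority rows until target_size is reached."""
    extra = [rng.choice(minority) for _ in range(target_size - len(minority))]
    return minority + extra


def smote(minority, target_size, k, rng):
    """SMOTE-style sketch: add synthetic rows on the line segment between a
    minority row and one of its k nearest minority neighbours."""
    synthetic = []
    while len(minority) + len(synthetic) < target_size:
        base = rng.choice(minority)
        # k nearest minority neighbours of the base row, excluding the row itself
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # interpolation weight in [0, 1)
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbour)))
    return minority + synthetic


def g_mean(sensitivity, specificity):
    """Geometric mean of sensitivity and specificity."""
    return math.sqrt(sensitivity * specificity)


# Hypothetical toy minority class (two made-up numeric features per household).
minority_rows = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1)]
rng = random.Random(0)

ros_rows = random_oversample(minority_rows, target_size=6, rng=rng)
smote_rows = smote(minority_rows, target_size=6, k=2, rng=rng)

# G-mean recomputed from the sensitivity and specificity in the abstract.
knn_g_mean = round(g_mean(0.6667, 0.9359), 2)       # 0.79
logistic_g_mean = round(g_mean(0.6667, 0.9102), 2)  # 0.78
```

Every row returned by `random_oversample` is an exact copy of an original minority row, while the extra rows from `smote` are new points lying between existing ones. Real household survey data would also contain categorical variables, which this numeric-only sketch does not handle.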
No other version available