Undergraduate Thesis (CD)
Clustering Oil Palm Treatments Based on Gene Expression Using K-Means Consensus Clustering
Increasing yields without expanding the planted area is a central challenge for oil palm plantations, and developing superior varieties is one way to meet it. One approach is to build predictive models from gene expression data, which reflect the genetic activity that regulates physiological processes such as growth and fruit formation. Gene expression data for 388 oil palm plants have been collected and stored in the National Center for Biotechnology Information (NCBI) database, but no study has yet specifically examined the relationships among these data. A deeper understanding of gene expression patterns in oil palm under various environmental conditions is important for developing superior varieties that are more adaptive and productive. This study analyzed oil palm transcriptome data using K-means consensus clustering with 20 iterations and the number of clusters ranging from 2 to 50. Evaluation with the silhouette coefficient identified the three-cluster configuration as optimal, whereas the delta K value derived from the consensus matrix indicated eight clusters as the most robust choice. Each cluster shows a distinctive gene expression pattern: Cluster 1 is related to responses to environmental stress and regeneration, Cluster 2 reflects biological variation associated with pathogen damage and phenotypic changes, and Cluster 3 reflects the influence of abiotic stress (heat).
Keywords: K-means consensus clustering, oil palm, transcriptome, silhouette coefficient.
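
The method named in the abstract can be illustrated with a minimal sketch. It is not the thesis's actual pipeline: the synthetic data, the 80% subsampling fraction, the reduced range of cluster counts, and the helper names `consensus_matrix`, `cdf_area`, and `delta_k` are all assumptions made for illustration. The delta K convention used here (relative change in the area under the consensus CDF from one cluster count to the next) is one common definition and may differ from the one used in the thesis.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score


def consensus_matrix(X, k, n_iter=20, frac=0.8, seed=0):
    """Fraction of resampling runs in which each pair of samples co-clusters."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    together = np.zeros((n, n))  # times a pair landed in the same cluster
    sampled = np.zeros((n, n))   # times a pair was drawn in the same subsample
    for _ in range(n_iter):
        # Subsample without replacement (the 80% fraction is an assumption).
        idx = rng.choice(n, size=int(frac * n), replace=False)
        km = KMeans(n_clusters=k, n_init=10,
                    random_state=int(rng.integers(1 << 31)))
        labels = km.fit_predict(X[idx])
        same = (labels[:, None] == labels[None, :]).astype(float)
        together[np.ix_(idx, idx)] += same
        sampled[np.ix_(idx, idx)] += 1.0
    # Pairs that were never drawn together keep a consensus of 0.
    return np.divide(together, sampled,
                     out=np.zeros_like(together), where=sampled > 0)


def cdf_area(M, bins=100):
    """Approximate area under the empirical CDF of the off-diagonal consensus values."""
    vals = M[np.triu_indices_from(M, k=1)]
    hist, _ = np.histogram(vals, bins=bins, range=(0.0, 1.0))
    cdf = np.cumsum(hist) / vals.size
    return cdf.sum() / bins


def delta_k(areas):
    """First entry is the area itself; later entries are relative changes in area."""
    out = [areas[0]]
    for prev, cur in zip(areas, areas[1:]):
        out.append((cur - prev) / prev)
    return out


# Synthetic stand-in for an expression matrix: 60 samples, 5 features, 3 groups.
rng = np.random.default_rng(42)
X = np.vstack([rng.normal(c, 1.0, size=(20, 5)) for c in (0.0, 10.0, 20.0)])

ks = list(range(2, 7))  # the thesis scans 2 to 50; shortened here
sil = {
    k: silhouette_score(
        X, KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    )
    for k in ks
}
areas = [cdf_area(consensus_matrix(X, k)) for k in ks]
delta = delta_k(areas)
best_k = max(sil, key=sil.get)

for k, d in zip(ks, delta):
    print(f"k={k}  silhouette={sil[k]:.3f}  delta_K={d:.3f}")
print("k with the highest silhouette:", best_k)
```

The silhouette coefficient scores a single partition of the full data, while delta K summarizes how stable the partitions are under resampling, so the two criteria can favor different numbers of clusters, as the abstract reports.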