Herman, Dinnar Nurhuda and Awaliyah Zuraiyah, Tjut and Tus Sadiah, Halimah (2024) COMPARISON OF SUPPORT VECTOR MACHINE AND RANDOM FOREST ALGORITHMS FOR DIABETES DISEASE PREDICTION. Skripsi thesis, Universitas Pakuan.
Abstract
Tjut Awaliyah Zuraiyah, Halimah Tussa’diah, Dinnar Nurhuda Hermawan
Department of Computer Science, Faculty of Mathematics and Natural Science, Pakuan University, Bogor, West Java, 16143, Indonesia

Diabetes mellitus is one of the most widespread and persistent diseases worldwide. About 425 million people currently live with it, and the number is estimated to reach 700 million by 2045. Before diagnosing the disease, doctors must analyze several factors, which makes their work inefficient. Current technology can help by predicting or detecting diabetes, which makes the disease easier for doctors to treat. In the medical field, classification methods can be used to classify the severity of a patient's disease. The purpose of this study was to determine the classification accuracy of the support vector machine and random forest algorithms on a dataset of diabetes patients under four scenarios for splitting training and test data, and then to build a web application. The study provides a model that health workers and the public can use as a first screening tool to identify patients who may have diabetes. The research method is CRISP-DM, with the stages of business understanding, data understanding, data preparation, modeling, evaluation, and deployment. The results show that the random forest algorithm predicts diabetes with higher performance than the support vector machine algorithm across the data-splitting scenarios. The developed diabetes prediction model performs best with the 70:30, 75:25, and 90:10 splits, with an accuracy of 96%, recall of 96%, precision of 94%, and F1-score of 95%.
Keywords: prediction; diabetes; CRISP-DM; random forests; support vector machines
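The abstract reports accuracy, recall, precision, and F1-score for several train/test split ratios. As a minimal sketch of how such figures are usually computed for a binary classifier, the Python below derives them from true and predicted labels and shows the sample counts implied by the three named splits. The function names `split_counts` and `binary_metrics`, the sample size of 1000, and the example labels are all hypothetical illustrations; they are not the thesis data or the authors' code.

```python
from typing import Dict, Sequence, Tuple


def split_counts(n_samples: int, train_percent: int) -> Tuple[int, int]:
    """Return (train, test) sizes for a split such as 70:30."""
    if not 0 < train_percent < 100:
        raise ValueError("train_percent must be between 1 and 99")
    n_train = n_samples * train_percent // 100
    return n_train, n_samples - n_train


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall and F1, treating label 1 as 'diabetic'."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("label sequences must be non-empty and of equal length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Hypothetical dataset size, used only to illustrate the split ratios.
    for train_percent in (70, 75, 90):
        print(train_percent, split_counts(1000, train_percent))

    # Made-up labels for illustration; not results from the thesis.
    y_true = [1, 1, 1, 0, 0, 0, 0, 1]
    y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
    print(binary_metrics(y_true, y_pred))
```

In a comparison like the one described, the same metric function would be applied to the test-set predictions of each classifier under each split, so that the two algorithms are scored on identical data.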
Item Type: | Thesis (Skripsi) |
---|---|
Subjects: | Faculty of Natural Sciences and Mathematics > Computer Science |
Divisions: | Faculty of Mathematics and Natural Sciences > Computer Science |
Depositing User: | PERPUSTAKAAN FAKULTAS MATEMATIKA DAN ILMU PENGETAHUAN ALAM UNPAK |
Date Deposited: | 28 Dec 2024 02:17 |
Last Modified: | 28 Dec 2024 02:17 |
URI: | http://eprints.unpak.ac.id/id/eprint/8777 |