Parallel Processing Impact on Random Forest Classifier Performance: A CIFAR-10 Dataset Study

Bareen Haval Sadiq; Subhi R. M. Zeebaree

doi:10.33022/ijcs.v13i2.3803

Authors

Bareen Haval Sadiq, Duhok Polytechnic University
Subhi R. M. Zeebaree, Energy Eng. Dept., Technical College of Engineering, Duhok Polytechnic University, Duhok, Iraq

DOI:

https://doi.org/10.33022/ijcs.v13i2.3803

Keywords:

Parallel Processing, Distributed Systems, Cloud Computing, Machine Learning, Random Forest

Abstract

This study uses the CIFAR-10 dataset to investigate how parallel processing affects the performance of the Random Forest classifier. Accuracy and training time serve as the key performance indicators. Two cases were compared: training with parallel processing and training without it. The results show that the Random Forest algorithm retains its strong predictive power under parallel execution, maintaining a high accuracy of 97.50%. Parallelization also shortens training time, from 0.6187 to 0.4753 seconds. This gain in time efficiency highlights the value of carrying out tasks simultaneously, which improves the computational efficiency of the training process. These findings offer insight into optimizing machine learning algorithms with parallel processing techniques.
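
To put the two reported training times in perspective, the short Python sketch below derives the speedup factor and the relative time reduction they imply. Only the two timings come from the abstract. The function and constant names are hypothetical, chosen for illustration, and nothing here reproduces the authors' experimental code.

```python
# Training times reported in the abstract, in seconds.
SEQUENTIAL_SECONDS = 0.6187  # without parallel processing
PARALLEL_SECONDS = 0.4753    # with parallel processing


def speedup(sequential: float, parallel: float) -> float:
    """Return the ratio of sequential to parallel time (above 1 means faster)."""
    return sequential / parallel


def percent_reduction(sequential: float, parallel: float) -> float:
    """Return the relative drop in training time as a percentage."""
    return 100.0 * (sequential - parallel) / sequential


if __name__ == "__main__":
    s = speedup(SEQUENTIAL_SECONDS, PARALLEL_SECONDS)
    r = percent_reduction(SEQUENTIAL_SECONDS, PARALLEL_SECONDS)
    print(f"Speedup: {s:.2f}x")
    print(f"Training time reduction: {r:.1f}%")
```

Under these figures the parallel run is roughly 1.30 times faster, which corresponds to a training time reduction of about 23%.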