Optimizing random forests: Spark implementations of random genetic forests

Sikha Bagui; Timothy Bennett

doi:10.54646/bije.2022.09

How to Cite

Bagui, S., & Bennett, T. (2022). Optimizing random forests: spark implementations of random genetic forests. BOHR International Journal of Engineering, 1(1), 42–51. https://doi.org/10.54646/bije.2022.09

Published: Oct 15, 2022

Keywords:

random forest
genetic algorithm
random genetic forest
bagging
big data
Hadoop
Spark
machine learning
network intrusion data

Authors

Sikha Bagui

Department of Computer Science, University of West Florida, Pensacola, FL, United States

Timothy Bennett

Department of Computer Science, University of West Florida, Pensacola, FL, United States

Abstract

The Random Forest (RF) algorithm, originally proposed by Breiman et al. (1), is a widely used machine learning
algorithm valued for its fast learning speed and high classification accuracy. Despite this widespread use, the
mechanisms at work in Breiman’s RF are not yet fully understood, and research continues on several aspects of
optimizing the algorithm, especially in big data environments. This work builds new ensembles that use genetic
algorithms to optimize the random portions of the RF algorithm, yielding Random Genetic Forests (RGF),
Negatively Correlated RGF (NC-RGF), and Preemptive RGF (PFS-RGF). These ensembles are compared with
Breiman’s classic RF algorithm in Hadoop’s big data framework using Spark on a large, high-dimensional
network intrusion dataset, UNSW-NB15.
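
As a rough illustration of what replacing a random choice with a genetic search can look like, the sketch below evolves a feature subset for a single tree instead of drawing it uniformly at random. It is not the authors' RGF implementation: the function name `evolve_feature_subset`, the parameters, the `SCORES` values, and the toy fitness function are all assumptions made for this example. In practice the fitness would more plausibly be the validation accuracy of a tree trained on the candidate subset.

```python
import random

def evolve_feature_subset(n_features, fitness, pop_size=20, generations=30,
                          mutation_rate=0.05, seed=0):
    """Evolve a boolean feature mask that maximizes fitness(mask).

    Hypothetical helper for illustration; requires n_features >= 2
    and pop_size >= 4.
    """
    rng = random.Random(seed)
    # Start from random masks, mirroring the random feature choice of a plain RF.
    population = [[rng.random() < 0.5 for _ in range(n_features)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Rank by fitness and keep the better half unchanged (elitism).
        population.sort(key=fitness, reverse=True)
        parents = population[:pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_features)  # single-point crossover
            child = a[:cut] + b[cut:]
            # Mutation: flip each bit with a small probability.
            child = [(not bit) if rng.random() < mutation_rate else bit
                     for bit in child]
            children.append(child)
        population = parents + children
    return max(population, key=fitness)


# Assumed per-feature usefulness scores (made-up numbers for the toy fitness).
SCORES = [0.9, 0.1, 0.8, 0.05, 0.7, 0.2, 0.6, 0.0]

def toy_fitness(mask):
    # Reward useful features and penalize subset size.
    return sum(s for s, keep in zip(SCORES, mask) if keep) - 0.3 * sum(mask)

best = evolve_feature_subset(len(SCORES), toy_fitness)
selected = [i for i, keep in enumerate(best) if keep]
print(selected, toy_fitness(best))
```

Because the better half of each generation is carried over unchanged, the best fitness in the population never decreases from one generation to the next.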

Issue

Vol. 1 No. 1 (2022): BOHR International Journal of Engineering (BIJE)

Section

Original Research
