On the Scalability of Machine-Learning Algorithms for Breast Cancer Prediction in Big Data Context

PROJECT TITLE:

On the Scalability of Machine-Learning Algorithms for Breast Cancer Prediction in Big Data Context

ABSTRACT:

Recent advances in information technology have caused data to grow at an exponential rate, ushering in the era of Big Data. Traditional machine-learning algorithms, however, cannot cope with the characteristics of Big Data. In this study, we address breast cancer prediction in the Big Data context. We consider two types of data: gene expression (GE) and DNA methylation (DM). The goal of this work is to scale up the machine-learning algorithms used for classification by applying them to each dataset separately and to both combined. We chose Apache Spark as the platform for this purpose. We selected three classification algorithms, namely support vector machine (SVM), decision tree, and random forest, and used them to build nine models for predicting breast cancer. To show which of the three data configurations yields the best result in terms of accuracy and error rate, we conducted a comprehensive comparative study over three scenarios: GE alone, DM alone, and GE and DM combined. We also carried out an experimental comparison of two platforms, Spark and Weka, to show how each behaves when handling large datasets. The experimental results show that the scaled SVM classifier in the Spark environment outperformed the other classifiers, achieving the highest accuracy and the lowest error rate on the GE dataset.
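The experimental design described in the abstract (three classifiers crossed with three data scenarios, compared by accuracy and error rate) can be sketched in plain Python as follows. This is only an illustrative outline, not the authors' Spark implementation: the identifiers `ALGORITHMS`, `SCENARIOS`, `experiment_grid`, and `best_model` are hypothetical, and error rate is assumed to be the complement of accuracy.

```python
from itertools import product

# Algorithm and scenario names come from the abstract;
# the identifiers themselves are hypothetical.
ALGORITHMS = ["svm", "decision_tree", "random_forest"]
SCENARIOS = ["GE", "DM", "GE+DM"]


def experiment_grid():
    """Return every (algorithm, scenario) pair: 3 x 3 = 9 models."""
    return list(product(ALGORITHMS, SCENARIOS))


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def error_rate(y_true, y_pred):
    """Assumed here to be 1 - accuracy (fraction misclassified)."""
    return 1.0 - accuracy(y_true, y_pred)


def best_model(results):
    """Return the (algorithm, scenario) key with the highest accuracy.

    `results` maps (algorithm, scenario) -> accuracy in [0, 1].
    """
    return max(results, key=results.get)
```

In such a setup, each pair returned by `experiment_grid()` would be trained and evaluated on held-out data, the resulting accuracies collected into a dictionary, and `best_model` used to pick the strongest configuration.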
