Chirag Bapat is a student at the RV College of Engineering, Bengaluru, India.
Poster Abstract
Improving any system requires ongoing studies that assess new and upcoming systems against current industry standards. In this project, we carry out such a comparative study between Hadoop, the current standard among Big Data analytics systems, and the platform provided by HPCC Systems. The study identifies both the similarities and the differences between the two setups, which in turn helps the end user or client make a more informed choice about which system to set up for their specific requirements.
Through the "Comparative study of HPCC Systems and Hadoop", we plan to set up each solution from scratch and analyse parameters that cover not only technical performance but also the overall user experience, such as the ease of setting up each environment and the time it takes. In our presentation, we aim to compare the following parameters:
Ease of access to material about each software platform
Time required to set up clusters
Ease of programming in each system's own programming language
Accuracy and execution time of various machine learning algorithms run on each system with datasets of different sizes, along with the lines of code (LoC) required to implement them
During the implementation of the machine learning algorithms, we will work with two datasets of different sizes: the smaller USA Cars dataset provided by HPCC Systems and a larger UK Housing dataset sourced from Kaggle. Running the same algorithms on both datasets lets us compare how each system performs under different loads.
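The measurements listed above (execution time, accuracy, and lines of code) can be illustrated with a small harness. The following is only a minimal sketch in Python, not the tooling used in the study: the helper names `timed`, `accuracy`, and `count_loc` are hypothetical, the toy model and data are made up for illustration, and the study does not say how its accuracy or LoC figures were computed. In practice, the timed call would wrap a job submitted to an HPCC Systems or Hadoop cluster.

```python
import time
from typing import Any, Callable, Sequence, Tuple


def timed(fn: Callable[..., Any], *args: Any) -> Tuple[Any, float]:
    """Run fn(*args) once and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


def accuracy(predicted: Sequence[Any], actual: Sequence[Any]) -> float:
    """Fraction of predictions that exactly match the true labels."""
    if len(predicted) != len(actual) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


def count_loc(source: str, comment_prefix: str) -> int:
    """Count non-blank lines that are not whole-line comments.

    Assumption: a line is a comment if it starts with comment_prefix;
    block comments are not handled in this sketch.
    """
    stripped = (line.strip() for line in source.splitlines())
    return sum(1 for line in stripped if line and not line.startswith(comment_prefix))


def always_one(rows: Sequence[Any]) -> list:
    """Hypothetical stand-in for a real model: predicts label 1 for every row."""
    return [1 for _ in rows]


if __name__ == "__main__":
    # Made-up labels standing in for a small and a larger dataset.
    for name, labels in (("small", [1, 0, 1, 1]), ("large", [1, 0] * 50_000)):
        predictions, seconds = timed(always_one, labels)
        print(name, f"accuracy={accuracy(predictions, labels):.2f}", f"time={seconds:.4f}s")
```

Counting only non-blank, non-comment lines is one of several possible LoC conventions; whichever convention is chosen should be applied identically to the code written for both systems so that the comparison stays fair.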
Presentation
In this video recording, Chirag gives a tour and explanation of his poster content.
Comparative Study Between HPCC Systems and Hadoop