Shashank B - 2023 Poster Contest Resources

Shashank is a second-year student at RV College of Engineering, Bengaluru, currently studying Computer Science Engineering.

Poster Abstract

Introduction:

Multi-node computation, also known as distributed computing, is a paradigm that uses multiple interconnected nodes or machines to perform complex computational tasks efficiently. ECL is a powerful declarative data-processing language with native multi-processing capabilities. Some core ECL functions, such as those that handle the Learning Tree algorithms of the Machine Learning Bundle, are recursive and therefore computationally expensive. This project aims to visualize the ML learning tree structure in ECL by embedding Python libraries.

Objective:

The main objective of this project is to use HPCC Systems®, an open-source big data analytics platform, to create a function that renders the structure of a decision tree as an image, making the algorithm and how it works easier to understand.

Methodology:

This project leverages HPCC Systems to create a function that visualizes decision trees. The record sets to be processed by the tree algorithms are sprayed onto the nodes of the system and then passed to Python library calls through the embed function. The decision tree models are constructed using the pandas, NumPy, and scikit-learn libraries. Once created, a model can be visualized and stored as an image, which is returned to ECL. The image data is then rendered by a custom Observable JavaScript script that uses the <canvas> element to draw the image pixel by pixel from its RGB values.
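
The Python side of this pipeline can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the project's actual code: the function name `render_tree_pixels`, the Iris sample data, the image size, and the one-record-per-pixel `(x, y, r, g, b)` layout are all hypothetical, and the ECL embed wrapper that would call this code is omitted. The sketch fits a scikit-learn decision tree, draws it off-screen with matplotlib, and flattens the rendered image into RGB records that a canvas script could draw pixel by pixel.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend: render off-screen, no display needed
import matplotlib.pyplot as plt
import numpy as np
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, plot_tree


def render_tree_pixels(features, labels, max_depth=3,
                       width_in=4, height_in=3, dpi=50):
    """Fit a decision tree and return (height, width, rows).

    rows holds one hypothetical (x, y, r, g, b) record per pixel.
    """
    # Build the model from the record set handed over by the caller.
    model = DecisionTreeClassifier(max_depth=max_depth, random_state=0)
    model.fit(features, labels)

    # Draw the fitted tree into an in-memory figure.
    fig, ax = plt.subplots(figsize=(width_in, height_in), dpi=dpi)
    plot_tree(model, ax=ax, filled=True)
    fig.canvas.draw()

    # Copy the rendered RGBA buffer before the figure is closed.
    rgba = np.array(fig.canvas.buffer_rgba())
    plt.close(fig)

    # Drop the alpha channel and pair every pixel with its coordinates.
    rgb = rgba[:, :, :3]
    height, width, _ = rgb.shape
    ys, xs = np.mgrid[0:height, 0:width]
    rows = np.column_stack([xs.ravel(), ys.ravel(), rgb.reshape(-1, 3)])
    return height, width, rows


# Example run on the Iris data set (a stand-in for the sprayed record sets).
iris = load_iris()
height, width, rows = render_tree_pixels(iris.data, iris.target)
print(height, width, rows.shape)
```

Returning flat records rather than an encoded image file keeps the result in a tabular shape that maps naturally onto an ECL record set, at the cost of a larger payload. A small figure size and low DPI keep the number of pixel records manageable.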

Presentation

In this video recording, Shashank gives a tour and explanation of his poster content.

Leveraging HPCC Systems for Visualization of Decision Trees Structure

