Goutami Sooda - 2024 Poster Contest Resources
Goutami Sooda is a second-year Computer Science student at RV College of Engineering, Bangalore, with a strong interest in Machine Learning, Computer Vision, Natural Language Processing, and Software Development. She is currently working as a Prism Intern at Samsung R&D Institute. Her project work includes the ITRL Project, where she helped develop a tool that uses NLP techniques to translate Kannada algorithms into Python code to aid rural students, and the Rice Grain Analysis project, where she applied image processing and machine learning to devise methods for assessing rice quality parameters. She also contributed to the Regression Test Engine of HPCC Systems by integrating a cleanup feature. Beyond her technical pursuits, she volunteers as Secretary of the IEEE-RVCE SIGHT Affinity Group for 2024-2025. She is passionate about learning new technologies and making meaningful contributions to society.
Poster Abstract
Introduction: -
HPCC Systems uses a regression testing module to validate new code and confirm that it integrates cleanly with the existing codebase. The Regression Test Engine (RTE) executes a large number of test cases, which generates a substantial number of workunits in the HPCC Systems cluster. In local development environments and on Cloud platforms such as AWS and Azure, these workunits accumulate and consume significant memory space, which increases operational costs and raises resource utilization concerns. An automated cleanup mechanism for workunits is needed to address this. The primary goal is to avoid inadvertently overloading the cluster and the Dali component, keeping resource management efficient and cost-effective.
Objective: -
Design and develop a cleanup module that deletes the workunits generated by the Regression Test Engine.
Design and develop a custom logging system for the cleanup module to record cleanup information.
Thoroughly test the cleanup module in both local and Cloud environments.
Integrate the code changes into the original HPCC Platform repository, along with documentation of the cleanup module.
Methodology: -
Familiarization with Regression Test Engine - Understand the HPCC Systems open-source repository on GitHub and the functioning of the Regression Test Engine.
Addition of Cleanup Parameter - Add a new cleanup sub-command to the run and query commands of the Regression Test Engine, allowing users to choose the cleanup mode (workunits, passed, or none).
Cleanup Module Development - Design and implement the cleanup module in Python to parse log files generated by the current run of the Regression Test Engine, obtaining workunit URLs to facilitate their deletion via HTTP requests to ECL Watch.
Logging System Development - Develop a custom logging module to log cleanup-related information (successful or failed deletion) for debugging purposes.
Testing and Optimization - Conduct thorough testing of the cleanup functionality in both local and Cloud environments of the HPCC Platform.
Documentation - Document the cleanup feature for both user and developer reference.
Integration - Merge the cleanup module with the original HPCC Platform GitHub repository, ensuring it is live in the production environment.
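The cleanup steps above can be illustrated with a minimal Python sketch. It is not the code merged into the HPCC Platform repository. The `--cleanup` flag spelling, the log line layout (a status word followed by a URL carrying a `Wuid=` parameter), the interpretation of each mode, and every function name below are assumptions made for illustration. The actual deletion request to ECL Watch is left as a caller-supplied function, since the request format is not described here.

```python
import argparse
import logging
import re
from typing import Callable, Iterable, List, Tuple

# Assumed workunit ID shape inside a logged URL, e.g. "...&Wuid=W20240101-123456".
WUID_RE = re.compile(r"Wuid=(W\d{8}-\d{6}(?:-\d+)?)")


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical command line exposing the three cleanup modes."""
    parser = argparse.ArgumentParser(prog="rte-cleanup-sketch")
    parser.add_argument(
        "--cleanup",
        choices=["workunits", "passed", "none"],
        default="none",
        help="workunits: delete all; passed: delete only passed tests; none: keep all",
    )
    return parser


def select_workunits(log_lines: Iterable[str], mode: str) -> List[str]:
    """Collect the workunit IDs that the chosen mode would delete.

    Assumes each relevant log line starts with a status word ("Pass" or
    "Fail") and contains a workunit URL. This layout is illustrative only.
    """
    if mode == "none":
        return []
    selected = []
    for line in log_lines:
        match = WUID_RE.search(line)
        if match is None:
            continue  # line does not reference a workunit
        passed = line.split(maxsplit=1)[0] == "Pass"
        if mode == "workunits" or passed:
            selected.append(match.group(1))
    return selected


def run_cleanup(
    wuids: Iterable[str],
    delete: Callable[[str], None],
    logger: logging.Logger,
) -> Tuple[List[str], List[str]]:
    """Call `delete` for each workunit and log success or failure.

    `delete` stands in for the HTTP request to ECL Watch and is expected
    to raise an exception when the deletion fails.
    """
    deleted, failed = [], []
    for wuid in wuids:
        try:
            delete(wuid)
        except Exception as exc:
            logger.error("Failed to delete %s: %s", wuid, exc)
            failed.append(wuid)
        else:
            logger.info("Deleted %s", wuid)
            deleted.append(wuid)
    return deleted, failed
```

Passing the deletion step in as a function keeps the log parsing and the success/failure logging testable without a running cluster; a real run would supply a function that sends the request to ECL Watch.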
Results: -
The cleanup module designed and developed during this project lets users choose the cleanup mode they require. When enabled, it deletes approximately 3,000 workunits generated by the regression tests across different clusters, including Thor, Roxie, and hThor. This significantly reduces the memory used by workunits in both local and Cloud environments. The reduced storage requirement in the Cloud environment is also expected to lower costs, since HPCC Systems uses AWS and Microsoft Azure services. Overall, the project reduced memory usage and improved the cost efficiency of the Regression Test Engine for HPCC Systems.
Presentation
In this video recording, Goutami gives a tour of her poster and explains its content.
Poster: Automated Conditional Cleanup after Regression Testing