Refactoring and releasing PyHPCC

This project is available as a student work experience opportunity with HPCC Systems. Curious about other projects we are offering? Take a look at our Ideas List.

Student work experience opportunities also exist for students who want to suggest their own project idea. Project suggestions must be relevant to HPCC Systems and of benefit to our open source community. 

Find out about the HPCC Systems Summer Internship Program.

Project Description

PyHPCC is a Python package that wraps the HPCC Systems web services to simplify communication between Python and HPCC Systems. Its core functionality includes submitting workunits (as inline queries or from a Git repository), reading the contents of a logical file, uploading files to a landing zone, and making Roxie calls. This internship project aims to improve the existing PyHPCC code base and make it available to the wider HPCC Systems community. The assigned intern will be responsible for this work.

To accomplish this, the following requirements must be considered:

  • Restructuring the existing code base to adhere to PEP 8 conventions.

  • Standardizing dependency management with Poetry to improve the developer experience.

  • Standardizing method names according to PEP 8 conventions. Example: renaming existing methods that do not use lowercase names with underscores.

  • Enhancing and refactoring existing methods. Example: updating PyHPCC methods so that they request responses in JSON rather than XML.

  • Code Style:

    1. Enforce a consistent code style using tools like flake8 or black.

    2. Include a configuration file (e.g., .editorconfig).

  • Creating a release workflow on GitHub that runs the test cases, builds the package (using Poetry), and adds the package to the repository.

  • Creating documentation:

    1. GitHub README file.

    2. Application documentation using Sphinx.

    3. Examples of distinct PyHPCC functions that new users can use in their projects.

  • Additional documentation:

    1. Guidelines for contributing to the project.

    2. Pull Request expectations.

    3. Guidelines on releases.
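
The method renaming and JSON items above might look like the following minimal sketch. It is an illustration only: the class `WorkunitClient`, the method names `getWuState` and `get_wu_state`, the `fetch` callable, and the shape of the JSON payload are all hypothetical and are not taken from the PyHPCC code base. Only the Python standard library is used.

```python
import json
import warnings


class WorkunitClient:
    """Hypothetical client; not the actual PyHPCC API."""

    def __init__(self, fetch):
        # `fetch` is any callable that returns the raw response body as text.
        self._fetch = fetch

    def get_wu_state(self, wuid):
        """PEP 8 (snake_case) name; parses a JSON body instead of XML."""
        payload = json.loads(self._fetch(wuid))
        return payload["State"]

    def getWuState(self, wuid):
        """Old camelCase name, kept as a deprecated alias."""
        warnings.warn(
            "getWuState is deprecated; use get_wu_state",
            DeprecationWarning,
            stacklevel=2,
        )
        return self.get_wu_state(wuid)


def fake_fetch(wuid):
    # Stand-in for an HTTP call; the payload shape is an assumption.
    return json.dumps({"Wuid": wuid, "State": "completed"})


client = WorkunitClient(fake_fetch)
state = client.get_wu_state("W20240101-000000")
```

Keeping the old name as a thin alias that emits a `DeprecationWarning` is one possible way to rename methods without immediately breaking existing callers; whether PyHPCC should keep such aliases is a design decision for the project.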

Completion of this project involves:

  • A structured code base with the base functionality refactored, enhanced, and documented.

  • A GitHub repo named PyHPCC under the HPCC Systems organization that is ready for use by all HPCC Systems users.

  • A well-documented repository with examples of how to use PyHPCC and how to contribute to the project.

  • Automated test cases that cover all implemented methods.

  • An automated GitHub workflow that builds an installable Python package within the GitHub repo.

  • A blog, a recorded presentation, and a poster artifact about your project (see examples from previous years here).
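
As a rough idea of what the automated test cases listed above could look like, here is a minimal sketch using `unittest` and `unittest.mock` from the standard library. The helper `upload_file`, its parameters, and the URL are hypothetical stand-ins and do not reflect PyHPCC's real functions; the mock replaces the network call so the test runs without an HPCC Systems cluster.

```python
import unittest
from unittest import mock


def upload_file(session, url, data):
    """Hypothetical helper: POST data and report whether the status was 200."""
    response = session.post(url, data=data)
    return response.status_code == 200


class UploadFileTest(unittest.TestCase):
    URL = "http://example.invalid/upload"  # placeholder, never contacted

    def test_returns_true_on_200(self):
        session = mock.Mock()
        session.post.return_value.status_code = 200
        self.assertTrue(upload_file(session, self.URL, b"x"))
        session.post.assert_called_once_with(self.URL, data=b"x")

    def test_returns_false_on_error(self):
        session = mock.Mock()
        session.post.return_value.status_code = 500
        self.assertFalse(upload_file(session, self.URL, b"x"))


# Run the two tests programmatically and keep the result object.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(UploadFileTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Tests written this way can run in a GitHub workflow without access to a live cluster, at the cost of not exercising the real web services.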

Mentor

Amila De Silva

Amila.De@lexisnexisrisk.com

Backup Mentor: 

Details coming soon

Skills needed
  • Familiarity with Python and GitHub

  • General knowledge of APIs

  • Self-motivated to learn about HPCC Systems (guidance will be provided)

Important resources

All pages in this wiki are subject to our site usage guidelines.