The proposal period for 2022 internships is now open
Submit your final proposal to Lorraine Chapman before Friday 18th March 2022

...

This project is already taken and is no longer available for the 2023 HPCC Systems Intern Program

This project is available as a student work experience opportunity with HPCC Systems. Curious about other projects we are offering? Take a look at our Ideas List.

Student work experience opportunities also exist for students who want to suggest their own project idea. Project suggestions must be relevant to HPCC Systems and of benefit to our open source community. 

Find out about the HPCC Systems Summer Internship Program.

Project Description

This project involves...

The introduction of cloud native support for HPCC Systems has opened up many opportunities for students looking to learn DevOps skills. One such opportunity is exploring best practices for adopting concepts such as Infrastructure as Code (IaC) for HPCC Systems deployments in the cloud.

In this context, the goal of this project is to develop a cloud-agnostic, opinionated HPCC Systems module using Terraform that runs on the major cloud providers, namely Azure, AWS and GCP. During development, you should consider factors such as cloud costs and system performance as part of building these modules. System logging and metrics are also essential for users running ECL workloads, so that engineers can quickly diagnose problems.

Upon successful completion of this work, it is expected that you will be able to identify and recommend:

  • Best practices for areas such as disk usage and topology
  • VM types, giving reasons for those recommendations
  • Opinionated Terraform modules to use

If you are interested in this project, please contact the mentor listed in the Mentor section.

By the end of this project, you should:

  • Have a good working understanding of Terraform and Terraform module concepts.
  • Have an excellent knowledge of the fundamental differences between the three major cloud providers.
  • Understand what goes into building an opinionated module that balances ease of use, performance, and logging.
  • Optimize performance by striking a balance between virtual machine instance types and cost.
  • Learn how to measure performance by running an ECL Terasort on a cluster.
  • Compare the different storage types each cloud provider offers, along with their drivers, and use them effectively.
  • Have a working understanding of Kubernetes.
  • Have a working knowledge of Helm.
  • Have a working knowledge of Kubernetes charts.
  • Have a working knowledge of Docker registries, such as Docker Hub and JFrog.
  • Code in YAML.
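
To make the cost and performance trade-off above concrete, here is a minimal Python sketch of how Terasort timings could be turned into a cost-per-run figure for each candidate VM type. Everything in it is an assumption for illustration: the VM labels, node counts, hourly prices and elapsed times are made-up placeholders, not measurements or real cloud prices, and the helper names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Candidate:
    """One cluster configuration that was timed with a Terasort run."""
    name: str               # hypothetical VM type label, not a real SKU
    nodes: int              # number of worker nodes in the cluster
    hourly_price: float     # price per node per hour (placeholder value)
    elapsed_seconds: float  # Terasort wall-clock time (placeholder value)

    @property
    def cost_per_run(self) -> float:
        # Cost of one run = nodes * price per hour * hours used.
        return self.nodes * self.hourly_price * self.elapsed_seconds / 3600.0


def rank_by_cost(candidates: List[Candidate],
                 max_seconds: Optional[float] = None) -> List[Candidate]:
    """Return candidates ordered from cheapest to most expensive per run.

    If max_seconds is given, configurations slower than that are dropped
    first, so the ranking only covers options that are fast enough.
    """
    eligible = [
        c for c in candidates
        if max_seconds is None or c.elapsed_seconds <= max_seconds
    ]
    return sorted(eligible, key=lambda c: c.cost_per_run)


# Placeholder inputs purely for illustration.
candidates = [
    Candidate("vm-small", nodes=8, hourly_price=0.20, elapsed_seconds=1800),
    Candidate("vm-medium", nodes=8, hourly_price=0.40, elapsed_seconds=720),
    Candidate("vm-large", nodes=8, hourly_price=0.80, elapsed_seconds=600),
]

if __name__ == "__main__":
    for c in rank_by_cost(candidates, max_seconds=1000):
        print(f"{c.name}: {c.cost_per_run:.2f} per run, {c.elapsed_seconds:.0f}s")
```

Ranking by cost per run rather than by hourly price is a deliberate choice in this sketch: a more expensive VM type can still be cheaper overall if it finishes the job sufficiently faster.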

Completion of this project involves:

  • <4+ high level tasks to be completed>

By the mid term review we would expect you to have:

  • <What must be completed to pass the evaluation and continue on to complete the project>

...

Skills needed

...

  • Provide a simple and opinionated way to build a standard HPCC Systems cluster using a common set of services. This standard pattern should reduce the cognitive load on teams who need to run these clusters.
  • The module should support deploying in an existing network (VPC, VNET).
  • The module should support using an existing storage solution via an import feature.
  • The module should support multiple input values files via an argument.
  • Logging and monitoring are crucial to running and diagnosing cluster-related issues, so the module should use Fluent, Loki, Prometheus, and Grafana to gather metrics and alerts.
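
As an illustration of the "multiple input values files" requirement above, the following Python sketch shows one plausible way layered values could be combined, assuming semantics similar to Helm's, where later files take precedence and nested maps are merged key by key. The dictionaries stand in for already-parsed YAML files, and all keys and function names are hypothetical; they are not taken from the HPCC Systems Helm chart or from any existing module.

```python
from copy import deepcopy
from typing import Any, Dict


def merge_values(base: Dict[str, Any], override: Dict[str, Any]) -> Dict[str, Any]:
    """Return a new dict where `override` is layered on top of `base`.

    Nested dicts are merged recursively; any other value (including lists)
    in `override` replaces the value in `base`.
    """
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_values(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


def layer_values(*layers: Dict[str, Any]) -> Dict[str, Any]:
    """Merge any number of values layers, with later layers winning."""
    result: Dict[str, Any] = {}
    for layer in layers:
        result = merge_values(result, layer)
    return result


# Hypothetical contents of two values files, already parsed from YAML.
defaults = {
    "cluster": {"workers": 2, "storageClass": "standard"},
    "logging": {"enabled": False},
}
production = {
    "cluster": {"workers": 8},
    "logging": {"enabled": True},
}

effective = layer_values(defaults, production)

if __name__ == "__main__":
    print(effective)
```

Keeping environment-specific overrides in a separate, smaller file means the defaults stay in one place and each environment only states what differs.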

Mentor

Wayne Carty
Wayne.Carty@lexisnexisrisk.com 

Backup Mentor: Godson Fortil
Godji.Fortil@lexisnexisrisk.com 

Skills needed

  • Ability to manage HPCC Systems deployments (guidance will be provided).
  • Ability to write test code.
  • Knowledge of ECL is not a requirement, since it should be possible to reuse existing code with minimal changes for this purpose. Links are provided below to our ECL training documentation and online courses should you wish to become familiar with the ECL language.

Deliverables

Midterm

  • <Deliverable(s) to be achieved>

End of project

  • <Deliverables expected by the end of the internship>
  • Cloud fundamental concepts
  • Basic programming knowledge and shell scripting
  • Familiarity with GitHub as a user and, to a lesser extent, as a developer

Other resources