Explore using Spot Instances on Azure and AWS

This project was completed by a student accepted on to the 2021 HPCC Systems Intern Program.

Project suggestions must be relevant to HPCC Systems and of benefit to our open source community. 

Find out about the HPCC Systems Summer Internship Program.

Project Description

An increasing number of companies are moving to cloud environments. However, taking full advantage of cloud infrastructure, particularly getting the job done while minimizing cost, remains a challenge. This project explores setting up Kubernetes clusters and running CI/CD builds and tests on spot instances in Azure and AWS.

For Kubernetes setup, we currently use Azure Kubernetes Service (AKS) on Azure and Elastic Kubernetes Service (EKS) on AWS. Both should support spot instances, but they need some special treatment. This work has higher priority than the CI/CD tasks described next. We hope to find a measurement (for example, an AZ/AWS client program) that can estimate how likely we are to obtain the instances we request. It would also be useful to either query AZ/AWS or write a function to track the trend of spot instances.
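
As a rough illustration of the trend and estimation idea on the AWS side, the sketch below parses the JSON printed by `aws ec2 describe-spot-price-history --output json` and derives two numbers: a least-squares price slope per hour and the share of samples at or below a chosen maximum price. It assumes that "trend" means price trend. The function names and the sample data are hypothetical, and the share is only a crude heuristic, since spot capacity also depends on factors that price history does not show.

```python
import json
from datetime import datetime
from statistics import mean


def parse_history(raw_json):
    """Return sorted (timestamp, price) pairs from describe-spot-price-history JSON."""
    records = json.loads(raw_json)["SpotPriceHistory"]
    points = [
        (
            datetime.fromisoformat(r["Timestamp"].replace("Z", "+00:00")),
            float(r["SpotPrice"]),  # the API reports the price as a string
        )
        for r in records
    ]
    return sorted(points)


def price_trend_per_hour(points):
    """Least-squares slope of price against hours since the first sample."""
    if len(points) < 2:
        return 0.0
    t0 = points[0][0]
    xs = [(t - t0).total_seconds() / 3600 for t, _ in points]
    ys = [p for _, p in points]
    x_bar, y_bar = mean(xs), mean(ys)
    denom = sum((x - x_bar) ** 2 for x in xs)
    if denom == 0:
        return 0.0
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / denom


def fraction_at_or_below(points, max_price):
    """Heuristic only: share of samples priced at or below max_price."""
    if not points:
        return 0.0
    return sum(1 for _, p in points if p <= max_price) / len(points)


# Made-up sample in the shape of the CLI output (not real prices).
SAMPLE = json.dumps({
    "SpotPriceHistory": [
        {"Timestamp": "2021-06-01T00:00:00+00:00", "SpotPrice": "0.040"},
        {"Timestamp": "2021-06-01T01:00:00+00:00", "SpotPrice": "0.042"},
        {"Timestamp": "2021-06-01T02:00:00+00:00", "SpotPrice": "0.044"},
    ]
})

if __name__ == "__main__":
    pts = parse_history(SAMPLE)
    print("slope per hour:", price_trend_per_hour(pts))
    print("share at or below 0.042:", fraction_at_or_below(pts, 0.042))
```

In practice the input would come from the CLI (or an equivalent SDK call) filtered by instance type and availability zone; the same functions could be reused for Azure once its price data is mapped into `(timestamp, price)` pairs.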

For CI/CD builds: current HPCC Systems development work, such as nightly builds and some functional tests, is done through Jenkins on AWS instances. Spot instances are much cheaper than their on-demand counterparts. However, we are not sure whether the pre-built AWS AMI and Azure VHD images, which contain the prerequisites for HPCC Systems development, can be adapted to spot instances. We may therefore need to provision a spot instance before it can be used for our development purposes. As another option, we can create a Docker image with all of the prerequisite packages.
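
One way to try a pre-built AMI on a spot instance is to launch it with `aws ec2 run-instances` and the `--instance-market-options` parameter. The sketch below only builds the command line so it can be reviewed before anything is launched. The function name, the AMI ID, and the instance type are placeholders, and this is not necessarily how the Jenkins setup would request its build nodes.

```python
import json
import shlex


def spot_run_instances_cmd(image_id, instance_type, max_price=None):
    """Build an `aws ec2 run-instances` command that requests one spot instance."""
    market = {
        "MarketType": "spot",
        "SpotOptions": {"SpotInstanceType": "one-time"},
    }
    if max_price is not None:
        # MaxPrice is passed as a string; omitting it caps at the on-demand price.
        market["SpotOptions"]["MaxPrice"] = str(max_price)
    return [
        "aws", "ec2", "run-instances",
        "--image-id", image_id,
        "--instance-type", instance_type,
        "--count", "1",
        "--instance-market-options", json.dumps(market),
    ]


if __name__ == "__main__":
    # Placeholder AMI ID and instance type, for illustration only.
    cmd = spot_run_instances_cmd("ami-0123456789abcdef0", "m5.xlarge", 0.10)
    print(shlex.join(cmd))
```

Printing the command instead of running it keeps the sketch free of side effects; passing the list to `subprocess.run` would execute it.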

This project requires knowledge of the AZ/AWS CLI, AKS and EKS, Docker, Unix shell, Jenkins, possibly Python, and HPCC Systems. Students may not have all of these skills but should be eager to learn the required tools and technologies.

If you are interested in this project, please contact Gordon Smith. 

Completion of this project involves:

  • Develop a deep understanding of how spot instances work in Azure and AWS.

  • Create AZ/AWS CLI scripts to estimate the likelihood of obtaining spot instances, monitor trends and calculate the money saved.

  • Create AZ/AWS CLI scripts to set up a Kubernetes cluster. Do Azure first.

  • Enable spot instances for our nightly build on at least one supported Linux distro, such as Ubuntu 20.04 or CentOS 7. This will be on AWS.

  • Use Jenkins and AWS Spot Instances for some HPCC Systems development work, at least the nightly build.

  • Documentation

  • Explore the potential spot instance usage for HPCC Systems related jobs
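
The "money saved" calculation in the list above can be sketched as a comparison of the on-demand and spot cost for the same number of instance hours. The function name is hypothetical and the prices in the example are made up, not real AWS or Azure rates.

```python
def spot_savings(on_demand_hourly, spot_hourly, hours):
    """Return (amount saved, percent saved) for running `hours` on spot."""
    on_demand_cost = on_demand_hourly * hours
    spot_cost = spot_hourly * hours
    saved = on_demand_cost - spot_cost
    percent = 100 * saved / on_demand_cost if on_demand_cost else 0.0
    return saved, percent


if __name__ == "__main__":
    # Illustrative prices only.
    saved, percent = spot_savings(on_demand_hourly=0.20, spot_hourly=0.05, hours=100)
    print(f"saved {saved:.2f} ({percent:.0f}%)")
```

A fuller script would sum this over the actual billing periods, since the spot price changes over time.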

By the mid term review we would expect you to have:

  • Enabled spot instances for our nightly build on at least one supported Linux distro, such as Ubuntu 18.04 or CentOS 7. This will be on AWS.

Mentor

Godson Fortil
Godson.Fortil@lexisnexisrisk.com

Backup Mentor: Xiaoming Wang Xiaoming.Wang@lexisnexis.com



Skills needed
  • General Cloud Environment knowledge, particularly AKS and EKS

  • Azure/AWS client API (shell or Python), S3, Docker, Jenkins, Packer

  • Unix Shell, Python

  • Ability to build and test the HPCC Systems platform (guidance will be provided).

  • Ability to write test code. Knowledge of ECL is not a requirement since it should be possible to re-use existing code with minimal changes for this purpose. Links are provided below to our ECL training documentation and online courses should you wish to become familiar with the ECL language.

Deliverables

Midterm

  • A script to estimate the likelihood of getting the desired spot instance.

  • Basic scripts to create an AKS cluster and apply it to HPCC Cloud testing.

  • Enable spot instances for our nightly build on at least one supported Linux distro, such as Ubuntu 18.04 or CentOS 7. This will be on AWS.
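
For the AKS scripts, part of the "special treatment" is that spot capacity is added as a separate user node pool, and AKS taints those nodes with `kubernetes.azure.com/scalesetpriority=spot:NoSchedule`, so pods need a matching toleration before they can be scheduled there. The sketch below builds an `az aks nodepool add` command and the toleration as plain Python data. The function name, resource names, and pool sizes are placeholders, not the project's actual values.

```python
import shlex


def aks_spot_nodepool_cmd(resource_group, cluster, pool_name,
                          min_count=1, max_count=3, max_price=-1):
    """Build an `az aks nodepool add` command for an autoscaled spot node pool."""
    return [
        "az", "aks", "nodepool", "add",
        "--resource-group", resource_group,
        "--cluster-name", cluster,
        "--name", pool_name,
        "--priority", "Spot",
        "--eviction-policy", "Delete",
        # -1 means "pay up to the current on-demand price".
        "--spot-max-price", str(max_price),
        "--enable-cluster-autoscaler",
        "--min-count", str(min_count),
        "--max-count", str(max_count),
    ]


# Toleration to place in a pod spec so the pod may run on AKS spot nodes.
SPOT_TOLERATION = {
    "key": "kubernetes.azure.com/scalesetpriority",
    "operator": "Equal",
    "value": "spot",
    "effect": "NoSchedule",
}

if __name__ == "__main__":
    # Placeholder resource group, cluster and pool names.
    print(shlex.join(aks_spot_nodepool_cmd("my-rg", "my-aks", "spotpool")))
```

The cluster itself still needs a regular (non-spot) system node pool; the spot pool is an addition to it.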

End of project

  • Complete scripts for estimating the likelihood of obtaining spot instances, monitoring trends and calculating the money saved.

  • Complete scripts for AKS and EKS creation.

  • Use Jenkins and AWS Spot Instances for some HPCC Systems development work, at least the nightly build.

  • Documentation

  • A GitHub project to host code and documentation

  • Explore the potential spot instance usage for HPCC Systems related jobs

Other resources
