Ryan Rao is a rising senior at American Heritage Palm Beach high school. In college, he plans to study computer science. He is on his school's varsity tennis team and in its math competition club, and he has practiced both tennis and competition math from a young age. He has also been playing piano since he was 6, although he practices much less now. His other interests include working out, reading, and learning random skills.
Poster Abstract
HPCC Systems is a powerful, open-source tool that can run in a Kubernetes cluster to leverage
cloud computing, pods, and containerization in order to solve big data challenges. Kubernetes
helps us manage applications made of many containers, such as HPCC, in different deployment
environments. However, Kubernetes doesn’t have a built-in storage layer; instead, it provides a
standard container storage interface (CSI) to specify how to provision and attach storage to the
Kubernetes cluster. This project focuses on enhancing EFS and implementing FSx for Lustre for
use by the HPCC cluster on AWS.
AWS offers many storage services that can be attached to a Kubernetes cluster. Two of them,
EFS and FSx, are fully managed, shared storage services. By attaching these to the Amazon EKS
cluster, the HPCC cluster can use the features they provide to get shared storage across nodes.
In general, there are three storage lifecycles. In an automatic storage lifecycle, storage is created
when the HPCC cluster is created, and it is deleted when the HPCC cluster is deleted. It cannot
be reused between different HPCC clusters. In a lifecycle within Kubernetes, storage lives at
the Kubernetes cluster level. This means that it is not deleted when the HPCC cluster is deleted,
so it can be reused across different HPCC clusters. Finally, in a lifecycle beyond Kubernetes, the
storage lives outside of the Kubernetes cluster. It remains even if the Kubernetes cluster is
deleted, meaning it can be reused across different Kubernetes clusters.
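The three lifecycles above can be summarized in a small Python sketch. This is only an illustrative model, not code from the HPCC platform: the enum and function names are hypothetical and simply encode which deletions each lifecycle survives.

```python
from enum import Enum


class StorageLifecycle(Enum):
    """The three storage lifecycles described above (names are hypothetical)."""
    AUTOMATIC = 1          # created and deleted together with one HPCC cluster
    WITHIN_KUBERNETES = 2  # lives at the Kubernetes cluster level
    BEYOND_KUBERNETES = 3  # lives outside the Kubernetes cluster


def survives_hpcc_cluster_deletion(lifecycle):
    """Return True if the storage remains after the HPCC cluster is deleted."""
    return lifecycle is not StorageLifecycle.AUTOMATIC


def survives_kubernetes_cluster_deletion(lifecycle):
    """Return True if the storage remains after the Kubernetes cluster is deleted."""
    return lifecycle is StorageLifecycle.BEYOND_KUBERNETES


if __name__ == "__main__":
    # Print a short summary of what each lifecycle survives.
    for lifecycle in StorageLifecycle:
        print(
            lifecycle.name,
            "| reusable across HPCC clusters:",
            survives_hpcc_cluster_deletion(lifecycle),
            "| reusable across Kubernetes clusters:",
            survives_kubernetes_cluster_deletion(lifecycle),
        )
```

Storage that survives the deletion of a cluster is exactly the storage that can be reused by the next cluster of that kind, which is why the two functions double as "reusable across clusters" checks.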
The HPCC platform already has storage implementations for EFS with the first and second storage
lifecycles, but not the third. Additionally, it has no storage implementation for FSx. The goal
of my project is to create the third storage lifecycle for the EFS implementation, giving users
more permanent storage, and to complete as much of the FSx implementation as I can.
This includes configuring all the necessary storage components: PVs, PVCs, storage classes,
EFS access points, the CSI driver, etc. In addition, I will build the necessary Helm charts and
improve any existing code and documentation.
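To make these components more concrete, here is a minimal Python sketch that builds a statically provisioned PV and a matching PVC for an existing EFS file system, in the spirit of the third lifecycle. It is a sketch under stated assumptions, not the project's Helm charts: the resource names, the storage class name `efs-sc`, the size, and the file system and access point IDs are all hypothetical placeholders. The driver name `efs.csi.aws.com` and the `fs-id::access-point-id` volume handle form follow the AWS EFS CSI driver's conventions.

```python
import json

# Driver name registered by the AWS EFS CSI driver.
EFS_CSI_DRIVER = "efs.csi.aws.com"


def efs_volume_handle(file_system_id, access_point_id=None):
    """Build the CSI volume handle; the access point part is optional."""
    if access_point_id:
        return f"{file_system_id}::{access_point_id}"
    return file_system_id


def build_pv(name, file_system_id, access_point_id=None,
             storage_class="efs-sc", size="5Gi"):
    """Return a PersistentVolume manifest that points at existing EFS storage."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolume",
        "metadata": {"name": name},
        "spec": {
            # Kubernetes requires a capacity value; EFS itself is elastic.
            "capacity": {"storage": size},
            "volumeMode": "Filesystem",
            "accessModes": ["ReadWriteMany"],
            # Retain keeps the PV object and its data when the claim is deleted.
            "persistentVolumeReclaimPolicy": "Retain",
            "storageClassName": storage_class,
            "csi": {
                "driver": EFS_CSI_DRIVER,
                "volumeHandle": efs_volume_handle(file_system_id, access_point_id),
            },
        },
    }


def build_pvc(name, pv_name, storage_class="efs-sc", size="5Gi"):
    """Return a PersistentVolumeClaim manifest bound to the PV by name."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "accessModes": ["ReadWriteMany"],
            "storageClassName": storage_class,
            "volumeName": pv_name,
            "resources": {"requests": {"storage": size}},
        },
    }


if __name__ == "__main__":
    # Hypothetical IDs and names, for illustration only.
    pv = build_pv("hpcc-efs-pv", "fs-0123456789abcdef0", "fsap-0123456789abcdef0")
    pvc = build_pvc("hpcc-efs-pvc", "hpcc-efs-pv")
    print(json.dumps([pv, pvc], indent=2))
```

Because the file system is created outside Kubernetes and the PV only references it, the data can outlive both the HPCC cluster and the Kubernetes cluster. Each manifest is plain JSON, which `kubectl apply -f` accepts alongside YAML.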
I have now completed the EFS implementation, and I am working on the second storage
lifecycle for FSx: storage that can be reused across different HPCC clusters. I am grateful to have
been able to contribute to the HPCC Systems platform, and I hope that any new features or
improvements I have added serve to further enhance the HPCC service, benefiting all its users.
Presentation
In this video recording, Ryan provides a tour and explanation of his poster content.
HPCC Systems Storage Support With Container Storage Interface (CSI)