Ryan Rao is a rising senior at American Heritage Palm Beach high school. In college, he plans to study computer science. He is a member of his school's varsity tennis team and math competition club, and he has practiced both tennis and competition math from a young age. He has also been playing piano since he was 6, although he practices much less now. His other interests include working out, reading, and learning random skills.
Poster Abstract
HPCC Systems is a powerful, open-source platform that can run in a Kubernetes cluster, leveraging cloud computing, pods, and containerization to solve big data challenges. Kubernetes helps us manage applications made of many containers, such as HPCC, in different deployment environments. However, Kubernetes doesn’t have a built-in storage layer; instead, it provides a standard Container Storage Interface (CSI) that specifies how to provision storage and attach it to the Kubernetes cluster. This project focuses on enhancing the existing Amazon Elastic File System (EFS) support and implementing Amazon FSx for Lustre support for use by the HPCC cluster on AWS.
AWS offers many storage services that can be attached to a Kubernetes cluster. Two of them are EFS and FSx for Lustre, which are fully managed, shared storage services. By attaching these to the Amazon EKS cluster, the HPCC cluster can use them to get shared storage across nodes.
In general, there are three storage lifecycles. In an automatic storage lifecycle, storage is created when the HPCC cluster is created and deleted when the HPCC cluster is deleted, so it cannot be reused between different HPCC clusters. In a lifecycle within Kubernetes, storage lives at the Kubernetes cluster level. It is not deleted when the HPCC cluster is deleted, so it can be reused across different HPCC clusters. Finally, in a lifecycle beyond Kubernetes, the storage lives outside of the Kubernetes cluster. It remains even if the Kubernetes cluster is deleted, so it can be reused across different Kubernetes clusters.
The HPCC platform already has EFS storage implementations for the first and second lifecycles, but not the third, and it has no storage implementation for FSx for Lustre. The goal of my project is to implement the third storage lifecycle for EFS, giving users more permanent storage, and to complete as much of the FSx for Lustre implementation as I can. This includes configuring all the necessary storage components: PVs, PVCs, storage classes, EFS access points, the CSI driver, etc. In addition, I will build the necessary Helm charts and improve existing code and documentation. Currently, I have completed the EFS implementation, and I am now working on the second storage lifecycle for FSx for Lustre: storage that can be reused across different HPCC clusters. I am grateful to have been able to contribute to the HPCC Systems platform, and I hope that the features and improvements I have added further enhance the HPCC service and benefit all its users.
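As a rough illustration of the three storage lifecycles described above, the following Python sketch models which deletions each lifecycle survives and builds a statically provisioned PV and PVC pair of the kind the third lifecycle relies on. It is not the HPCC Helm chart implementation: the names `Lifecycle`, `survives`, and `static_efs_volume`, the volume name, and the file system ID are hypothetical and exist only for this example. The driver name `efs.csi.aws.com` and the use of the EFS file system ID as the `volumeHandle` follow the AWS EFS CSI driver conventions.

```python
from enum import Enum


class Lifecycle(Enum):
    AUTOMATIC = 1          # created and deleted together with the HPCC cluster
    WITHIN_KUBERNETES = 2  # outlives the HPCC cluster, but not the Kubernetes cluster
    BEYOND_KUBERNETES = 3  # outlives the Kubernetes cluster as well


def survives(lifecycle, deleted):
    """Return True if the storage remains after `deleted` ('hpcc' or 'kubernetes') is removed."""
    if deleted == "hpcc":
        return lifecycle is not Lifecycle.AUTOMATIC
    if deleted == "kubernetes":
        return lifecycle is Lifecycle.BEYOND_KUBERNETES
    raise ValueError("deleted must be 'hpcc' or 'kubernetes'")


def static_efs_volume(name, file_system_id, size="5Gi"):
    """Build PV and PVC manifests (as dicts) for an EFS file system created outside Kubernetes."""
    pv = {
        "apiVersion": "v1",
        "kind": "PersistentVolume",
        "metadata": {"name": name},
        "spec": {
            # Kubernetes requires a capacity value; EFS itself is elastic.
            "capacity": {"storage": size},
            "accessModes": ["ReadWriteMany"],
            # Retain keeps the underlying file system when the PV is released.
            "persistentVolumeReclaimPolicy": "Retain",
            # Empty class: bind statically instead of provisioning dynamically.
            "storageClassName": "",
            "csi": {"driver": "efs.csi.aws.com", "volumeHandle": file_system_id},
        },
    }
    pvc = {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name + "-claim"},
        "spec": {
            "accessModes": ["ReadWriteMany"],
            "storageClassName": "",
            "volumeName": name,  # bind to the PV defined above
            "resources": {"requests": {"storage": size}},
        },
    }
    return pv, pvc


# Hypothetical volume name and file system ID, for illustration only.
pv, pvc = static_efs_volume("hpcc-data", "fs-0123456789abcdef0")
```

Because the file system is created outside the cluster and the PV only references it by ID, deleting the HPCC cluster or the Kubernetes cluster removes the Kubernetes objects but leaves the data in place, which is the property the third lifecycle is after.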
Presentation
In this video recording, Ryan provides a tour and explanation of his poster content.
HPCC Systems Storage Support With Container Storage Interface (CSI)