About Lucas Varella

Lucas Varella is studying for a Bachelor of Information Systems at the Federal University of Santa Catarina (UFSC) in Brazil. His university course covers a broad range of subjects, including Object-Oriented Programming, Data Structures, and Databases (from relational to NewSQL), as well as some marketing and business modules. Lucas is working with Hugo Watanuki (Senior Technical Support Engineer, LexisNexis Risk Solutions) and is completing a year-long internship in the Brazil office.

Poster Abstract

The advent of containers and their associated orchestration tools in recent years has fundamentally shifted how computational workloads are built and managed in distributed computing environments. Containers offer a consistent, lightweight runtime environment through OS-level virtualization, with low overhead for maintaining and scaling applications efficiently, while the management of containers is handled by container orchestrators. Container orchestration tools, such as Kubernetes, have a mechanism to launch and manage containers as clusters or pods, providing automation for running service containers. Orchestration therefore provides a flexible way of scaling services that run inside containers and require load balancing, fault tolerance, and horizontal scaling.
 
However, not all distributed computing environments can be easily ported to the container orchestration paradigm. This migration can be a bigger challenge for data-intensive supercomputing technologies such as the HPCC (High Performance Computing Cluster) Systems platform. This is due to the batch-queuing nature of most of these platforms, which make strict assumptions about data storage persistence and host-specific shared resources: for example, that each node must securely maintain its own set of data and will be reading from and writing to a single shared file system. For the HPCC Systems platform in particular, which has historically relied on data locality, the migration toward the container orchestration paradigm can be especially challenging.
 
Despite this challenging scenario, and given the trend toward containerization, some advances have been made in making data-intensive platforms such as HPCC Systems available in containerized environments running in public clouds. How this new platform architecture behaves in terms of functionality and performance across different public cloud providers, and in comparison to the original bare-metal architecture, is still largely an open question.
 
The overall objective of this in-progress study is to explore the usage of the first HPCC Systems version with native support for containerization. To this end, a cross-provider experiment will be executed to compare overall HPCC Systems performance and functionality among Azure Kubernetes Service (AKS), Amazon's Elastic Kubernetes Service (EKS), and bare metal. A benchmark test suite will be used to measure data transformation performance. It is expected that this study will contribute to a better understanding of how the recently released HPCC Systems version with native support for containerization behaves in terms of performance and functionality, as well as provide insights for future developments.

Presentation

In this video recording, Lucas provides a tour and explanation of his poster content.

Poster Title: A Cross Provider Assessment for HPCC Systems Container Orchestration

Click on the poster for a larger image. The original PDF version can be found here (available for download).

[Image: poster, "A Cross Provider Assessment for HPCC Systems Container Orchestration"]