HPCC Systems Intern Program - Class of 2020
Find out more about the HPCC Systems Summer Intern Program, including how to apply, and read this blog introducing the students and their projects.
Seven students joined our intern program in 2020. They presented their projects at our Tech Talk webcasts during the year and entered our 2020 Poster Contest, held at our virtual HPCC Systems Community Day Summit in October 2020.
Due to COVID-19 all internships were completed remotely.
Meet the Class of 2020
Name | Project Title | Description | Mentor(s) | Resources |
---|---|---|---|---|
Jack Fields High School Student | Using the GNN Bundle with TensorFlow to train a model to find known faces | Process the data from collected images using our Generalized Neural Network (GNN) Bundle with TensorFlow to train a model that can recognise known faces. This supports the work of the AHS Robotics Team, who are building an Autonomous Security Robot (Watch Demo) that can recognise potential risks on a school campus that might otherwise be missed by the human eye. Using object and facial recognition, they can capture faces and recognise them with 93% accuracy using TensorFlow. | David DeHilster | Tech Talk Presentation, August 2020 |
Jefferson Mao High School Student | Establish HPCC Systems on the Google Cloud Platform | Work through the steps required to use HPCC Systems on the Google Cloud Platform. Design a web application for creating a new HPCC Systems cluster on this cloud service. Explore Google Cloud Anthos (a new Google Kubernetes deployment platform) with an HPCC Systems cluster. Analyse how running HPCC Systems on the Google Cloud Platform compares with other cloud services (such as AWS), looking at performance, security and cost effectiveness. | Xiaoming Wang | |
Matthias Murray Masters in Data Science | Applying HPCC Systems Word Vectors to SEC Filings | Report on the current status of vectorisation and NLP representation of SEC filings and then compile identified SEC filing cases and their intersection from a LexisNexis perspective. Sort and transform SEC data, creating a function to convert the data into a format required by the HPCC Systems Word Vectors ML bundle. | Lili Xu | |
Nathan Halliday High School Student | Execute Multiple Workflow Items in Parallel | Restructure the workflow engine to create a graph of tasks that can be used to track which tasks have been executed and which tasks should be executed next. Ensure that there are no multi-threading issues in the workflow engine. The plan is to support ROXIE and Thor. | Gavin Halliday | |
Robert Kennedy Masters in Computer Science | Implement a Multi-node, Multi-GPU Accelerated Deep Learning Algorithm using GNN | Expand on our existing GNN bundle to improve our GPU accelerated neural network training. The aim is that HPCC Systems will be able to train neural networks, at scale, across many GPUs, across many GPU enabled nodes using different parallelisation techniques that are suited to deep learning tasks. Increase the robustness of the underlying GNN library by identifying areas for improvement while documenting best practices to be used when training neural networks on GPUs using the GNN bundle. | Tim Humphrey | |
Vannel Zeufack Masters in Computer Science | Implement a Preprocessing Bundle for the HPCC Systems ML Library | Make the data preprocessing phase of machine learning on HPCC Systems easier and faster. Produce a preprocessing bundle tutorial to demonstrate how the different modules in the preprocessing bundle can be used together to easily prepare data for a machine learning project. | Arjuna Chala | |
Yash Mishra Masters in Computer Science | Leveraging and evaluating Kubernetes support on Microsoft Azure | Use our new cloud native platform to leverage the Kubernetes support for HPCC Systems, focusing on performance measurements, cost analysis and various configuration options. Provide a comparison of running the HPCC Systems bare metal version and the new Kubernetes support of cloud native HPCC Systems on Microsoft Azure. | Dan Camper | |
Profile of our intern program in 2020
7 students - 3 High School, 3 Masters, 1 PhD
Global and inclusive program, with one student located in Europe (UK) and 2 international students studying in the USA.
2 returning students
All remote working
Spread of projects: 2 Cloud, 4 Machine Learning, 1 Core Platform
14 mentors involved including 2 academic mentors
HPCC Systems platform related projects
Establish HPCC Systems on the Google Cloud Platform
Execute Multiple Workflow Items in Parallel
Leveraging and evaluating Kubernetes support on Microsoft Azure
Machine learning related projects
Applying HPCC Systems Word Vectors to SEC Filings
Implement a Multi-node, Multi-GPU Accelerated Deep Learning Algorithm using GNN *
Implement a Preprocessing Bundle for the HPCC Systems ML Library
Using the GNN Bundle with TensorFlow to train a model to find known faces *
* Projects suggested by students themselves