The project proposal application period for 2020 summer internships is now closed. Check back in the Fall for details about applying to join our 2021 program.
Find out more about the HPCC Systems Summer Intern Program, including how to apply, and read this blog introducing the students and their projects.
Seven students joined our intern program in 2020. They presented their projects at our tech talk webcasts during the year and entered our 2020 Poster Contest, held at our virtual HPCC Systems Community Day Summit in October 2020.
Due to COVID-19 all internships were completed remotely.
Meet the Class of 2020
Name | Project Title | Description | Mentor(s) | Resources |
---|---|---|---|---|
Jack Fields High School Student | Using the GNN Bundle with TensorFlow to train a model to find known faces | Process the data from collected images using our Generalized Neural Network (GNN) Bundle with TensorFlow to train a model that can recognise known faces. This supports the work of the AHS Robotics Team, who are building an Autonomous Security Robot (Watch Demo) that can recognise potential risks on a school campus that might otherwise be missed by the human eye. Using object and facial recognition, they can capture faces and recognise them with 93% accuracy using TensorFlow. | David DeHilster | |
Jefferson Mao High School Student | Establish HPCC Systems on the Google Cloud Platform | Work through the steps required to use HPCC Systems on the Google Cloud Platform. Design a web application for creating a new HPCC Systems cluster on this cloud service. Explore Google Cloud Anthos (a new Google Kubernetes deployment platform) with an HPCC Systems cluster. Analyse how running HPCC Systems on the Google Cloud compares with other cloud services (such as AWS), looking at performance, security and cost effectiveness. | Xiaoming Wang | |
Matthias Murray Masters in Data Science | Applying HPCC Systems Word Vectors to SEC Filings | Report on the current status of vectorisation and NLP representation of SEC filings and then compile identified SEC filing cases and their intersection from a LexisNexis perspective. Sort and transform SEC data, creating a function to convert the data into a format required by the HPCC Systems Word Vectors ML bundle. | Lili Xu | |
Nathan Halliday High School Student | Execute Multiple Workflow Items in Parallel | Restructure the workflow engine to create a graph of tasks that can be used to track which tasks have been executed and which tasks should be executed next. Ensure that there are no multi-threading issues in the workflow engine. The plan is to support ROXIE and Thor. | Gavin Halliday | Community Day presentation: Watch Recording, View Slides |
Robert Kennedy Masters in Computer Science | Implement a Multi-node, Multi-GPU Accelerated Deep Learning Algorithm using GNN | Expand on our existing GNN bundle to improve our GPU accelerated neural network training. The aim is that HPCC Systems will be able to train neural networks, at scale, across many GPUs, across many GPU enabled nodes using different parallelisation techniques that are suited to deep learning tasks. Increase the robustness of the underlying GNN library by identifying areas for improvement while documenting best practices to be used when training neural networks on GPUs using the GNN bundle. | Tim Humphrey | Watch Recording, View Slides |
Vannel Zeufack Masters in Computer Science | Implement a Preprocessing Bundle for the HPCC Systems ML Library | Make the data preprocessing phase of machine learning on HPCC Systems easier and faster. Produce a preprocessing bundle tutorial to demonstrate how the different modules in the preprocessing bundle can be used together to easily prepare data for a machine learning project. | Arjuna Chala | |
Yash Mishra Masters in Computer Science | Leveraging and evaluating Kubernetes support on Microsoft Azure | Use our new Cloud native platform to leverage the Kubernetes support for HPCC Systems, focusing on performance measurements, cost analysis and various configuration options. Provide a comparison of running the HPCC Systems bare metal version and the new Kubernetes support of cloud native HPCC Systems on Microsoft Azure. | Dan Camper | |
Profile of our intern program in 2020
- 7 students
- 3 High School, 4 Masters
- Global and inclusive program, with one student located in Europe (UK)
- 2 returning students
- All remote workers (working from home)
- Spread of projects: 2 Cloud, 4 Machine Learning, 1 Core Platform
- 14 mentors involved including 2 academic mentors
HPCC Systems platform related projects
- Establish HPCC Systems on the Google Cloud Platform
- Execute Multiple Workflow Items in Parallel
- Leveraging and evaluating Kubernetes support on Microsoft Azure
Machine learning related projects
- Applying HPCC Systems Word Vectors to SEC Filings
- Implement a Multi-node, Multi-GPU Accelerated Deep Learning Algorithm using GNN *
- Implement a Preprocessing Bundle for the HPCC Systems ML Library
- Using the GNN Bundle with TensorFlow to train a model to find known faces *
* Projects suggested by students themselves