Tech Talk 36 - September 17th 2020
11am ET
Guest Speakers and subjects:
Matthias Murray, Masters in Data Science, New College of Florida - Watch Recording
HPCC Systems Intern 2020
Applying HPCC Systems Word Vectors to SEC Filings
Matthias Murray graduates with a Masters in Data Science in 2020. He previously completed a BA in Maths and Physics, also at New College of Florida, producing a thesis on Thin Film Fracture and Finite Element Analysis Fundamentals.
His project involves reporting on the current status of vectorisation and NLP representation of SEC filings, and then compiling identified SEC filing cases and their intersection from a LexisNexis perspective. He will need to sort and transform SEC data, creating a function to convert the data into the format required by the HPCC Systems Word Vectors ML bundle. More information about this project is available in the associated JIRA issue.
Matthias has lots of ideas about how the results of his project may be of practical help in a business setting. These include a tool for calling up particular filing details for a specific company, predictions such as expected analyst rating upgrades/downgrades before they are officially issued, and a visualisation tool showing interesting patterns in extracted filings.
Matthias's mentor is Lili Xu, Software Engineer III, LexisNexis Risk Solutions Group, but he is also being supported by Professor Burcin Bozkaya from New College of Florida.
Robert Kennedy, Research Assistant, Florida Atlantic University - Watch Recording
HPCC Systems Intern 2020
Implementing a Multi-node, Multi-GPU Accelerated Deep Learning Algorithm using GNN
Robert Kennedy joins our intern program for the third year running. During his 2020 internship, he aims to expand on our existing GNN bundle to improve our GPU-accelerated neural network training. By the end of his internship, HPCC Systems will be able to train neural networks at scale, across many GPUs on many GPU-enabled nodes, using different parallelisation techniques that are suited to deep learning tasks. Robert's work will increase the robustness of the underlying GNN library by identifying areas for improvement, while documenting best practices to be used when training neural networks on GPUs using the GNN bundle. More information about this project is available in the associated JIRA issue.
Throughout his intern projects he has been supported by Dr Taghi Khoshgoftaar, Florida Atlantic University, who is an old friend of the HPCC Systems Open Source Project. As in all previous years, Robert's mentor is Tim Humphrey, Consulting Software Engineer, LexisNexis Risk Solutions Group.
You can find out about Robert's previous projects using the links below. He also entered our poster contest during his previous internships, placing third in 2018 and a well-deserved first in 2019.
GPU Accelerated Neural Networks on HPCC Systems - Tech Talk Presentation / View Poster / Community Day 2019 Presentation (Watch Recording / View Slides)
Begin development of a software library that would provide HPCC Systems distributed neural network training - Tech Talk Presentation / View Poster / Community Day 2018 Presentation (Watch Recording / View Slides)
Vannel Zeufack, Masters in Computer Science, Kennesaw State University - Watch Recording
HPCC Systems Intern 2020
Implementing a Preprocessing Bundle for the HPCC Systems ML Library
Vannel Zeufack is a second-year Masters student at KSU and a returning HPCC Systems intern. In 2019, he developed a KMeans-based anomaly detection system applied to network systems' log files. If you want to know more about this project, listen to Vannel present at our Tech Talk in September 2019. You can also see the poster he entered into our 2019 Technical Poster Contest, where he placed third. His 2019 blog journal also provides details about the progress made on his project as well as his HPCC Systems internship experience. In 2020, he developed a Preprocessing Bundle for the HPCC Systems Machine Learning library to facilitate the data preparation phase of Machine Learning projects undertaken on the HPCC Systems platform. Vannel is an aspiring software engineer, with interests in developing high-quality, reliable and positively impactful software.
Vannel's project this year is quite different to the one he completed in 2019. The purpose of his 2020 project is to make the data preprocessing phase of machine learning on HPCC Systems easier and faster. He also plans to produce a preprocessing bundle tutorial to demonstrate how the different modules in the preprocessing bundle can be used together to prepare data for a machine learning project. More information about this project is available in the associated JIRA issue.
Vannel's mentor is Lili Xu, Software Engineer III, LexisNexis Risk Solutions Group.