Develop and Assess Unsupervised Anomaly Detection Methods

This project was completed as an internship opportunity with HPCC Systems in 2019. Curious about projects we are offering for future internships? Take a look at our Ideas List.

Find out about the HPCC Systems Summer Internship Program.

Project Description

Create an ECL Bundle encompassing several widely used Anomaly Detection algorithms. Identify at least one algorithm that is parallelizable and implement it efficiently in ECL:

  • Identify a publicly available dataset and method for assessing anomalies in the data

  • Research state of the art Anomaly Detection algorithms and choose two or more for implementation

  • Implement the chosen methods in ECL on an HPCC Systems cluster and assess how well they identify the target anomalies
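
As an illustration of the assessment step in the list above, here is a minimal sketch in Python (not ECL, and not the method used in the project). It assumes a small labelled toy dataset and uses a modified z-score based on the median absolute deviation as a stand-in detector; the function names, the data, and the conventional 3.5 threshold are all assumptions chosen for the example. Precision and recall against the known labels are one common way to judge whether target anomalies are being found.

```python
from statistics import median


def mad_scores(values):
    """Return a modified z-score (median/MAD based) for each value."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # All points sit on the median; nothing can be ranked as unusual.
        return [0.0 for _ in values]
    # 0.6745 rescales the MAD so scores are comparable to standard z-scores.
    return [0.6745 * abs(v - med) / mad for v in values]


def flag_anomalies(values, threshold=3.5):
    """Flag values whose score exceeds the (assumed) threshold."""
    return [score > threshold for score in mad_scores(values)]


def precision_recall(predicted, actual):
    """Compare predicted flags with ground-truth labels."""
    tp = sum(p and a for p, a in zip(predicted, actual))
    fp = sum(p and not a for p, a in zip(predicted, actual))
    fn = sum(a and not p for p, a in zip(predicted, actual))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical toy data: two injected anomalies (25.0 and -4.0).
readings = [10.1, 9.8, 10.0, 10.3, 9.9, 25.0, 10.2, 9.7, -4.0, 10.0]
labels = [False, False, False, False, False, True, False, False, True, False]

flags = flag_anomalies(readings)
precision, recall = precision_recall(flags, labels)
```

Because each score depends only on the value itself plus two global statistics (the median and the MAD), the scoring step is the kind of computation that could be spread across the nodes of a cluster once those statistics are known.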

By the mid-term review we would expect you to have:

  • Implemented and tested at least one algorithm

Mentor

Roger Dev
Contact Details

Backup Mentor: TBD
Contact Details 

Skills needed
  • Knowledge of ECL. Training manuals and online courses are available on the HPCC Systems website.

  • Knowledge of distributed computing techniques

  • Knowledge of anomaly detection algorithms

Deliverables
  • Code for the implementation, checked in to GitHub

  • An appropriate dataset for testing the algorithms.

  • Test code demonstrating the correctness and performance of the algorithm.

  • Supporting documentation.

Other resources
