Adding dataset support to the HPCC Systems Wasm plugin

This project is available as a student work experience opportunity with HPCC Systems. Curious about other projects we are offering? Take a look at our Ideas List.

Student work experience opportunities also exist for students who want to suggest their own project idea. Project suggestions must be relevant to HPCC Systems and of benefit to our open source community. 

Find out about the HPCC Systems Summer Internship Program.

Project Description

WebAssembly (wasm) is a low-level binary format for the web. Code written in other languages is compiled to wasm to achieve near-native performance, and wasm complements JavaScript rather than replacing it. As a result, developers can write solutions and applications in their language of choice, compile them to wasm binary files, and run them in web browsers with performance similar to a native environment. The HPCC Systems platform currently supports embedding wasm via a plugin. Given its flexibility and security characteristics, wasm has proven to be an attractive way to further expand ECL support for embedded languages: the code runs in a secure enclave, which eliminates the need to sign the embedded code. However, the plugin is still under development and requires further expansion.

The goal of this project is to add dataset support to the wasm plugin so data structures contained in compiled wasm modules can be manipulated from within ECL code running on HPCC Systems.
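To make the idea of passing a dataset across the ECL/wasm boundary more concrete, the following is a minimal sketch of one possible approach: flattening rows into a contiguous byte buffer of the kind that could be copied into a wasm module's linear memory, and reading them back. Everything here is an assumption made for illustration. The Row layout (standing in for an ECL record such as {INTEGER4 id; REAL8 score; STRING name}), the function names, and the length-prefixed encoding are hypothetical and do not describe the existing plugin's interface or ABI, which may take a different route entirely.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <limits>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical row, standing in for an ECL record such as
// {INTEGER4 id; REAL8 score; STRING name}. Not the plugin's real layout.
struct Row {
    std::int32_t id;
    double score;
    std::string name;
};

// Append the raw bytes of a fixed-size value to the buffer.
// Assumes a little-endian host, which matches wasm linear memory byte order.
template <typename T>
void appendRaw(std::vector<std::uint8_t>& buf, const T& value) {
    const auto* p = reinterpret_cast<const std::uint8_t*>(&value);
    buf.insert(buf.end(), p, p + sizeof(T));
}

// Read a fixed-size value at `offset` and advance the offset past it.
template <typename T>
T readRaw(const std::vector<std::uint8_t>& buf, std::size_t& offset) {
    if (offset + sizeof(T) > buf.size())
        throw std::out_of_range("buffer too short");
    T value;
    std::memcpy(&value, buf.data() + offset, sizeof(T));
    offset += sizeof(T);
    return value;
}

// Encode one row as: id (4 bytes), score (8 bytes),
// name length (4 bytes), name bytes.
void packRow(std::vector<std::uint8_t>& buf, const Row& row) {
    assert(row.name.size() <= std::numeric_limits<std::uint32_t>::max());
    appendRaw(buf, row.id);
    appendRaw(buf, row.score);
    appendRaw(buf, static_cast<std::uint32_t>(row.name.size()));
    buf.insert(buf.end(), row.name.begin(), row.name.end());
}

// Decode one row starting at `offset`, advancing the offset past it.
Row unpackRow(const std::vector<std::uint8_t>& buf, std::size_t& offset) {
    Row row;
    row.id = readRaw<std::int32_t>(buf, offset);
    row.score = readRaw<double>(buf, offset);
    const auto len = readRaw<std::uint32_t>(buf, offset);
    if (offset + len > buf.size())
        throw std::out_of_range("string runs past end of buffer");
    row.name.assign(reinterpret_cast<const char*>(buf.data()) + offset, len);
    offset += len;
    return row;
}

// Encode a dataset as a 4-byte row count followed by the packed rows.
std::vector<std::uint8_t> packDataset(const std::vector<Row>& rows) {
    std::vector<std::uint8_t> buf;
    appendRaw(buf, static_cast<std::uint32_t>(rows.size()));
    for (const auto& row : rows)
        packRow(buf, row);
    return buf;
}

// Decode a buffer produced by packDataset back into rows.
std::vector<Row> unpackDataset(const std::vector<std::uint8_t>& buf) {
    std::size_t offset = 0;
    const auto count = readRaw<std::uint32_t>(buf, offset);
    std::vector<Row> rows;
    for (std::uint32_t i = 0; i < count; ++i)
        rows.push_back(unpackRow(buf, offset));
    return rows;
}
```

Calling packDataset on a vector of rows and passing the result to unpackDataset should return the same rows. A real implementation would also need to cover the remaining ECL data types, nested records and child datasets, and memory ownership on both sides of the boundary, none of which this sketch attempts.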

Requirements and considerations for this project:

  • The initial implementation of the wasm plugin is already available and should be leveraged (see repository in the "Important Resources" section below).

  • Coverage of all data types, both passed in and returned, including multi-threaded access from the ECL side.

  • Testing the performance and throughput of the system with examples that approximate real-world usage.

  • The conceptual approach for this project is similar to the one used in the development of the parquet plugin (see repository in the "Important Resources" section below).

Completion of this project involves:

  • Extending the simple wrapper to handle structured data between the ECL embed and wasm.

  • Test cases demonstrating the correct behavior and performance of the plugin.

  • A complete GitHub project with code and documentation describing how ECL data types and structures are mapped to wasm.

  • A blog, a recorded presentation, and a poster artifact about your project (see examples from previous years here).

Mentor

Gordon Smith
Gordon.Smith@lexisnexisrisk.com

Backup Mentor: 

TBD

Skills needed
  • Ability to code in C++.

  • Ability to build and test the HPCC system (guidance will be provided).

  • Knowledge of wasm sufficient to write and run test cases.

  • Knowledge of ECL sufficient to reuse existing code (see resources below to become familiar with the ECL language).

  • Ability to write test code.

Important resources
