...
The HPCC Systems platform currently supports embedding Python, Java, JavaScript, R, MySQL, C++ and Cassandra code. The goal of this project is to support MongoDB by allowing MongoDB database queries to be embedded within ECL code running on HPCC Systems.
We are also looking at using similar techniques to provide simple interfaces to some key-value stores like Ceph, S3 and Kafka (though they may not look like embedded languages). This will allow native reads and writes directly from and to the object store, reducing the extra latency currently created by the requirement to move the data into the internal distributed filesystem prior to processing. The interface to Kafka is already under development.
One of the challenges of this project is to address how an external key-value store interacts with a distributed Thor query, so that the external datastore acts like a distributed file that is read by each node in the Thor cluster, or to which each node writes only a portion of a result. The HPCC Systems developers are actively discussing this but have not yet resolved it.
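To make the partitioning question more concrete, here is a minimal Python sketch of one possible approach: each node uses a stable hash to decide which keys of the external store it is responsible for, so that every key is read by exactly one node. This is only an illustration, not the approach chosen by the HPCC Systems developers (the design is still open). The function names, the `user:N` key format and the node count are all hypothetical.

```python
import zlib


def node_for_key(key: str, node_count: int) -> int:
    """Map a key to a node index using a stable hash (CRC32).

    Python's built-in hash() is randomized per process, so it cannot be
    used here: every node must compute the same answer for the same key.
    """
    return zlib.crc32(key.encode("utf-8")) % node_count


def keys_for_node(keys, node_index: int, node_count: int):
    """Return the subset of keys that the given node should read."""
    return [k for k in keys if node_for_key(k, node_count) == node_index]


# Hypothetical example: three nodes sharing six keys from an external store.
all_keys = [f"user:{i}" for i in range(6)]
partitions = [keys_for_node(all_keys, n, 3) for n in range(3)]

for n, part in enumerate(partitions):
    print(f"node {n} reads {part}")
```

Hash partitioning needs no coordination between nodes, but it requires each node to know the full key list (or to be able to filter on the store side), and the partitions may be uneven. A real integration would have to weigh this against range-based or store-native partitioning.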
The HPCC platform supports hooks to add additional file formats, such as reading directly from archives or from git repositories, and these may be used as the basis for the S3 support.
Additional languages are added to the system via a “plugin” system. Use one of the existing plugins, such as MySQL (available here) or Python (available here), as an example of the sort of work required. Each completed plugin is considered to be a new feature addition to the HPCC Platform.
...
Mentor | Dan Camper; Backup Mentor: Richard Chapman |
Skills needed |
|
Deliverables | Midterm
End of project
|
Other resources |
|