Project Description

The implementation can be done entirely in ECL, so only knowledge of ECL and distributed computing techniques is required.

Completion of this project involves:

  • Generation of the test data. Three sets of test data are required, one for each of the three test cases:
    • Somewhat uniform distribution where each node has data from the entire range.
    • Skewed data where at least half of the nodes do not have observations in at least 50% of the range.
    • Highly skewed data where range overlaps do not occur.
  • Development of the algorithm using ECL.
  • Testing the algorithm for correctness and performance, which involves comparing the approximate solution to the exact solution and validating that the results are within the specified tolerance.
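
As a rough illustration of the three test data cases listed above, the following sketch simulates per-node data in Python. It is not ECL and not part of the project deliverable. The function name `make_test_data`, the case labels, the node count, the value range, and the choice to restrict even-numbered nodes to the lower half of the range are all assumptions made for this example.

```python
import random

def make_test_data(case, nodes=4, per_node=1000, lo=0.0, hi=1000.0, seed=42):
    """Return one list of values per simulated node (hypothetical helper)."""
    rng = random.Random(seed)          # fixed seed so runs are repeatable
    width = (hi - lo) / nodes          # slice width for the disjoint case
    mid = lo + (hi - lo) / 2           # midpoint of the full range
    data = []
    for node in range(nodes):
        if case == "uniform":
            # Every node draws from the entire range.
            a, b = lo, hi
        elif case == "skewed":
            # Assumption: even-numbered nodes see only the lower half,
            # so at least half of the nodes miss 50% of the range.
            a, b = (lo, mid) if node % 2 == 0 else (lo, hi)
        elif case == "disjoint":
            # Each node owns its own slice, so node ranges do not overlap.
            a, b = lo + node * width, lo + (node + 1) * width
        else:
            raise ValueError(f"unknown case: {case}")
        data.append([rng.uniform(a, b) for _ in range(per_node)])
    return data

if __name__ == "__main__":
    for case in ("uniform", "skewed", "disjoint"):
        ranges = [(round(min(v)), round(max(v))) for v in make_test_data(case)]
        print(case, ranges)
```

Printing the per-node minimum and maximum gives a quick check that each generated set has the intended shape before it is used to test the algorithm.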

By the GSoC midterm review, we would expect you to have written the ECL needed to generate the test data for the three cases.

Mentor

John Holt
Contact details: Contact Details

Skills needed
  • Knowledge of ECL. Training manuals and online courses are available on the HPCC Systems website.
  • Knowledge of distributed computing techniques
Deliverables
  • Test code demonstrating the correctness and performance of the algorithm.
  • Supporting documentation.
Other resources