Project Description
The implementation can be done entirely in ECL; only knowledge of ECL and distributed computing techniques is required.
Completion of this project involves:
- Generation of the test data. Three sets of test data are required, one for each of the three test cases:
  - A roughly uniform distribution, where each node holds data from the entire range.
  - Skewed data, where at least half of the nodes have no observations in at least 50% of the range.
  - Highly skewed data, where the ranges held by different nodes do not overlap.
- Development of the algorithm using ECL.
- Testing the algorithm for correctness and performance, which involves comparing the approximate solution to the exact solution and validating that the results are within the specified tolerance.
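To make the three test cases concrete, here is a minimal Python sketch of how the per-node data could be shaped. It is an illustration only, not the ECL the project calls for. The function name `make_node_data`, the case labels, the node count, the value range, and the choice of which nodes are restricted are all assumptions made for this example.

```python
import random


def make_node_data(case, n_nodes=4, per_node=1000, lo=0.0, hi=100.0, seed=0):
    """Return one list of observations per simulated node.

    Hypothetical helper: `case` is one of "uniform", "skewed", "disjoint".
    """
    rng = random.Random(seed)          # seeded so runs are repeatable
    mid = lo + (hi - lo) / 2.0
    width = (hi - lo) / n_nodes
    nodes = []
    for i in range(n_nodes):
        if case == "uniform":
            # Every node draws from the entire range.
            a, b = lo, hi
        elif case == "skewed":
            # Assumption: even-numbered nodes (at least half of them) see
            # only the lower half of the range; the rest see all of it.
            a, b = (lo, mid) if i % 2 == 0 else (lo, hi)
        elif case == "disjoint":
            # Each node owns its own slice, so node ranges do not overlap.
            a, b = lo + i * width, lo + (i + 1) * width
        else:
            raise ValueError("unknown case: %r" % (case,))
        nodes.append([rng.uniform(a, b) for _ in range(per_node)])
    return nodes


if __name__ == "__main__":
    for case in ("uniform", "skewed", "disjoint"):
        data = make_node_data(case, per_node=5)
        print(case, [(round(min(d), 1), round(max(d), 1)) for d in data])
```

In an ECL implementation the same shapes would be produced on the cluster itself, with each node generating or receiving its own slice; the sketch only pins down what each of the three distributions is meant to look like.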
By the GSoC midterm review, we would expect you to have written the ECL needed to generate the test data for the three cases.
Mentor | John Holt |
Skills needed | |
Deliverables | |
Other resources | |