The proposal application period for the 2021 HPCC Systems Intern Program is now open.
The deadline for proposal applications is Friday 19th March 2021.
Discuss your ideas with the project mentor and send your final proposal to Lorraine Chapman.
This project was completed by Everett Matt Upchurch Butler, an undergraduate studying for a BS in Information Technology. Matt completed this project as an intern in 2018, joining us in January of that year, earlier than most, and completing his internship before the summer started. Most students join over the summer months, but we can be flexible to accommodate study schedules and commitments.
Find out about the HPCC Systems Summer Internship Program.
Project Description
A standard Math library would greatly expand the ECL language. Mathematics is constant: under normal circumstances its rules do not change. It is a critical component of nearly every aspect of society and is therefore useful in countless applications. ECL has already shown a massive improvement over its predecessors, and this project can widen that gap further in terms of completeness and efficiency. For this reason, a standard Math library should be created, starting with the functions involving probability distributions. The probability distributions discussed in this proposal cover the vast majority of those considered useful and will go a long way toward extending the capabilities of ECL.
If you are interested in this project, please contact John Holt.
Completion of this project involves:
FUNCTION to add Beta Distribution
FUNCTION to add Binomial Distribution
FUNCTION to add Non-central Chi square Distribution
FUNCTION to add F Distribution
FUNCTION to add Non-central F Distribution
FUNCTION to add Gamma Distribution
FUNCTION to add Negative Binomial Distribution
FUNCTION to add Poisson Distribution
Test to determine accuracy and speed of PDFs
Documentation regarding the sources of these approximations
FUNCTION to estimate parameters of the PDFs
FUNCTION to calculate the inverse of the PDFs
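To make the shape of these deliverables concrete, here is a minimal sketch of one distribution from the list (Poisson) with its probability function, cumulative function, and inverse. It is written in Python purely for illustration; the project itself targets ECL. The function names `poisson_pmf`, `poisson_cdf`, and `poisson_quantile` are hypothetical and are not part of ML_Core or any existing library.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) for X ~ Poisson(lam), evaluated in log space to avoid overflow."""
    if lam <= 0.0:
        raise ValueError("lam must be positive")
    if k < 0:
        return 0.0
    # log P = k*ln(lam) - lam - ln(k!), with ln(k!) = lgamma(k + 1)
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))


def poisson_cdf(k: int, lam: float) -> float:
    """P(X <= k), by direct summation of the probability function."""
    return min(1.0, sum(poisson_pmf(i, lam) for i in range(k + 1)))


def poisson_quantile(p: float, lam: float) -> int:
    """Inverse of the cumulative function: smallest k with P(X <= k) >= p."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must satisfy 0 <= p < 1")
    k = 0
    total = poisson_pmf(0, lam)
    while total < p:
        k += 1
        term = poisson_pmf(k, lam)
        if term == 0.0 and k > lam:
            break  # the remaining upper tail has underflowed to zero
        total += term
    return k


print(poisson_quantile(0.5, 3.0))  # 3
```

Working in log space with `math.lgamma` is one common way to keep factorial terms from overflowing. A simple accuracy check, in the spirit of the testing item above, is to confirm that the probabilities sum to 1 within a stated tolerance.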
Wish list
- FUNCTION to find the best fitting distribution given a random dataset
- FUNCTION to add Trigonometric Functions
- FUNCTION to add Logarithmic Functions
- FUNCTION to add additional Probability Distributions
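As a rough illustration of the best-fit item on the wish list, the sketch below estimates parameters for two candidate distributions and keeps the one with the higher log-likelihood. It is Python rather than ECL, and it rests on several assumptions: the candidates are limited to Normal and Gamma, the Gamma parameters come from the method of moments rather than maximum likelihood, and all function names are hypothetical.

```python
import math
from typing import Dict, List, Tuple


def mean_and_variance(xs: List[float]) -> Tuple[float, float]:
    """Sample mean and population variance (divides by n)."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    return mean, var


def fit_gamma_moments(xs: List[float]) -> Tuple[float, float]:
    """Method-of-moments estimate of Gamma (shape, scale)."""
    mean, var = mean_and_variance(xs)
    return mean * mean / var, var / mean


def gamma_log_likelihood(xs: List[float], shape: float, scale: float) -> float:
    """Sum of log Gamma densities; every x must be positive."""
    return sum(
        (shape - 1.0) * math.log(x) - x / scale
        - shape * math.log(scale) - math.lgamma(shape)
        for x in xs
    )


def normal_log_likelihood(xs: List[float], mean: float, var: float) -> float:
    """Sum of log Normal densities."""
    return sum(
        -0.5 * math.log(2.0 * math.pi * var) - (x - mean) ** 2 / (2.0 * var)
        for x in xs
    )


def best_fit(xs: List[float]) -> str:
    """Name of the candidate with the highest log-likelihood on xs."""
    mean, var = mean_and_variance(xs)
    if var <= 0.0:
        raise ValueError("data must not be constant")
    scores: Dict[str, float] = {"normal": normal_log_likelihood(xs, mean, var)}
    if min(xs) > 0.0:  # Gamma is only defined for positive values
        shape, scale = fit_gamma_moments(xs)
        scores["gamma"] = gamma_log_likelihood(xs, shape, scale)
    return max(scores, key=lambda name: scores[name])
```

Both candidates here have two parameters, so comparing raw log-likelihoods is reasonable. A fuller version would likely need a penalized criterion once candidates with different parameter counts are added.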
By the midterm review we would expect you to:
- Establish familiarity and agreement with your mentor on project goals, implementation timeline, and action plan. Assess the efficiency and structure of the existing probability distributions. There are currently 3 in the ML_Core repo: Normal, Chi squared, and Student's t. Design the workflow from data gathering to final implementation and testing.
- Perform further in-depth research on the different Probability Distributions. Decide which Probability Distributions should be included, with priority based on perceived usefulness. Design the PDF framework. Additional Distributions can be added later, after the framework has been created. Begin searching for algorithms that are not restricted by intellectual property rights.
- Continue gathering the necessary mathematical algorithms. Two weeks are dedicated to this task because there is a high probability of difficulty in collecting algorithms that are not protected by intellectual property rights. Documentation of the sources of these approximations is critical here.
- Create a pseudo code implementation. This step is critical because it is generally prohibited to copy an existing algorithm from another language.
Mentor | Roger Dev; Backup Mentor: John Holt |
Skills needed | |
Deliverables | Midterm; End of project |
Other resources | |