This project is available as an internship opportunity with HPCC Systems this summer.
Find out more about the HPCC Summer Internship Program.
Deadline for machine learning project proposals - Friday March 25th 2016
Curious about other projects we are currently offering? Take a look at our Ideas List.
Deadline for non-machine learning project proposals - Friday April 15th 2016
Project Description
Singular value decomposition (SVD) has many applications. For example, SVD can be applied to natural language processing for latent semantic analysis (LSA). LSA starts with a matrix whose rows represent words, whose columns represent documents, and whose elements are counts of the word in the document. It then applies SVD to the input matrix and uses a subset of the most significant singular vectors and their corresponding singular values to map words and documents into a new space, called the ‘latent semantic space’. In this space, documents are placed near each other when they use words with similar co-occurrence patterns, even if those particular words never co-occurred in the training corpus.
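The LSA steps described above can be sketched in a few lines. This is only an illustration in Python with NumPy, not the ECL implementation the project calls for. The toy corpus, the choice of k = 2, and all variable and function names are assumptions made for the example.

```python
import numpy as np

# Hypothetical toy corpus: each document is a list of words.
# Documents 0 and 2 share no words, but both overlap with document 1.
docs = [
    ["ship", "ocean"],
    ["ocean", "boat"],
    ["boat", "voyage"],
    ["tree", "leaf"],
    ["leaf", "forest"],
]

# Build the word-by-document count matrix (rows = words, columns = documents).
vocab = sorted({w for d in docs for w in d})
row = {w: i for i, w in enumerate(vocab)}
A = np.zeros((len(vocab), len(docs)))
for j, d in enumerate(docs):
    for w in d:
        A[row[w], j] += 1

# Thin SVD: A = U @ diag(s) @ Vt, singular values in descending order.
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Keep the k most significant singular triplets (k = 2 is an arbitrary choice here).
k = 2
doc_vecs = (np.diag(s[:k]) @ Vt[:k, :]).T   # one row per document
term_vecs = U[:, :k] * s[:k]                # one row per word

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

raw_sim = cosine(A[:, 0], A[:, 2])               # similarity on raw counts
latent_sim = cosine(doc_vecs[0], doc_vecs[2])    # similarity in latent space
cross_sim = cosine(doc_vecs[0], doc_vecs[3])     # unrelated topic

print(f"raw: {raw_sim:.3f}  latent: {latent_sim:.3f}  cross-topic: {cross_sim:.3f}")
```

In this toy corpus, documents 0 and 2 have no words in common, so their similarity on raw counts is zero. In the two-dimensional latent space they end up pointing in the same direction, while a document from the other topic stays orthogonal to them. A real corpus would need a much larger k and typically a sparse, distributed SVD.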
...
- Written the ECL needed to process the text documents into a dataset of term vectors.
Mentor | John Holt |
Backup Mentor | Edin Muharemagic |
Skills needed |
|
Deliverables |
|
Other resources |
|