Project Description
Singular value decomposition (SVD) has many applications. In natural language processing, for example, it is the basis of latent semantic analysis (LSA). LSA starts with a matrix whose rows represent words, whose columns represent documents, and whose elements are the counts of each word in each document. It then applies SVD to this matrix and uses a subset of the most significant singular vectors, together with the corresponding singular values, to map words and documents into a new space called the ‘latent semantic space’. In this space, items are placed near each other according to word co-occurrence patterns, so related words and documents can end up close together even if the words never appeared together in any document of the training corpus.
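The LSA mapping described above can be sketched on a toy matrix. This is only an illustration in Python with NumPy, not the ECL implementation the project calls for. The vocabulary, the counts, and the helper names `lsa` and `cosine` are assumptions made for the example, and scaling the singular vectors by the singular values is one common convention among several.

```python
import numpy as np

# Hypothetical toy corpus: rows are words, columns are documents.
# "car" and "automobile" never appear in the same document,
# but both co-occur with "engine".
terms = ["car", "automobile", "engine", "flower", "petal"]
A = np.array([
    [1, 0, 0],   # car
    [0, 1, 0],   # automobile
    [1, 1, 0],   # engine
    [0, 0, 1],   # flower
    [0, 0, 1],   # petal
], dtype=float)


def lsa(matrix, k):
    """Map words and documents into a k-dimensional latent space."""
    U, s, Vt = np.linalg.svd(matrix, full_matrices=False)
    # Keep the k most significant singular triplets.
    U_k, s_k, Vt_k = U[:, :k], s[:k], Vt[:k, :]
    term_vecs = U_k * s_k      # one k-dimensional row per word
    doc_vecs = Vt_k.T * s_k    # one k-dimensional row per document
    return term_vecs, doc_vecs


def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))


term_vecs, doc_vecs = lsa(A, k=2)

print("raw    car~automobile:", cosine(A[0], A[1]))
print("latent car~automobile:", cosine(term_vecs[0], term_vecs[1]))
print("latent car~flower    :", cosine(term_vecs[0], term_vecs[3]))
```

In the raw count matrix the rows for "car" and "automobile" are orthogonal. In the two-dimensional latent space they point in nearly the same direction, because both co-occur with "engine", while "flower" stays unrelated to "car".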
LSA’s notion of term-document similarity can be applied to information retrieval, creating a system known as Latent Semantic Indexing (LSI). An LSI system calculates the similarity between the terms in a query and each document by building a k-dimensional query vector as the sum of the k-dimensional vector representations of the individual terms, and comparing it to the k-dimensional document vectors.
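A minimal sketch of the query step, again in Python with NumPy rather than ECL. The toy matrix, the function name `lsi_scores`, and the choice of cosine similarity for the comparison are assumptions made for illustration. The sketch also assumes one common fold-in convention: term vectors are rows of U_k divided by the singular values, and document vectors are rows of V_k.

```python
import numpy as np

# Hypothetical toy corpus: rows are words, columns are documents.
terms = ["car", "automobile", "engine", "flower", "petal"]
A = np.array([
    [1, 0, 0],   # car
    [0, 1, 0],   # automobile
    [1, 1, 0],   # engine
    [0, 0, 1],   # flower
    [0, 0, 1],   # petal
], dtype=float)

k = 2
U, s, Vt = np.linalg.svd(A, full_matrices=False)
term_vecs = U[:, :k] / s[:k]   # k-dimensional term vectors (assumed scaling)
doc_vecs = Vt[:k, :].T         # k-dimensional document vectors


def lsi_scores(query_terms, terms, term_vecs, doc_vecs):
    """Score every document against a query given as a list of terms."""
    index = {t: i for i, t in enumerate(terms)}
    # Terms outside the vocabulary are skipped (a choice made for this sketch).
    rows = [term_vecs[index[t]] for t in query_terms if t in index]
    if not rows:
        return np.zeros(doc_vecs.shape[0])
    q = np.sum(rows, axis=0)   # query vector = sum of its term vectors
    den = np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(q)
    den = np.where(den == 0, 1.0, den)   # avoid division by zero
    return (doc_vecs @ q) / den          # cosine similarity per document


scores = lsi_scores(["automobile", "engine"], terms, term_vecs, doc_vecs)
for doc_id in np.argsort(-scores):
    print(f"document {doc_id}: {scores[doc_id]:.3f}")
```

With this data the first document, which contains "car" and "engine" but not "automobile", scores about as high as the second document, while the third document about flowers scores near zero.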
The implementation can be done entirely in the ECL language; only a knowledge of ECL and of distributed computing techniques is required.
...
Mentor | John Holt |
Skills needed |
|
Deliverables |
|
Other resources |
|