Academic publications supported by the HPCC Systems project
The following papers and publications have been produced over the years by professors and students who have collaborated with us as part of our academic program, as well as by a number of LexisNexis Risk Solutions employees. We are proud to have supported our academic partners and colleagues in their research.
Every year, we welcome a number of students on to the HPCC Systems summer intern program and we are equally proud to see that some of them have contributed to published research. Some students have also presented their research as part of our Technical Poster Presentation Contest, held at our annual HPCC Systems Community Summit.
Have you contributed to a paper or publication supported by HPCC Systems in some way? Or do you have a publication coming soon? Please contact us to tell us more about your research.
Are you using HPCC Systems as part of your project? Share your story with us.
Year / Title | Author(s) | Accredited Organization | |
---|---|---|---|
2024 | |||
Auto-Detection of Field-Level Dependencies in Data Workflow on a Distributed Platform | Y Surya; Sumanth Hedge; Jyoti Shetty; Shobha G; and Dan Camper | Rashtreeya Vidyalaya College of Engineering, India | |
Local outlier factor for Anomaly Detection in HPCC Systems | Arya Adesh; G Shobha; Jyoti Shetty; and Lili Xu | Rashtreeya Vidyalaya College of Engineering, India | |
Semantic Type Detection or Unlabelled Dataset | Akanksha A Pai; Manoj M; Deeptha Giridhar; Sharon Thomas; Jyoti Shetty; Shobha G | Rashtreeya Vidyalaya College of Engineering, India | |
Synthesizing class labels for highly imbalanced credit card fraud detection data | Robert Kennedy, Flavio Villanustre, Taghi M. Khoshgoftaar, and Zahra Salekshahrezaee | Florida Atlantic University, USA | |
Analyzing Blockchain Data to Detect Bitcoin Addresses Involved in Illicit Activities Using Anomaly Detection | Sarthak Sharan, Divye Sancheti, Dr Shobha G, Jyoti Shetty, Arjuna Chala, Hugo Watanuki | Rashtreeya Vidyalaya College of Engineering, India | |
2023 | |||
Iterative cleaning and learning of big highly-imbalanced fraud data using unsupervised learning | Robert Kennedy, Zahra Salekshahrezaee, Flavio Villanustre, and Taghi M. Khoshgoftaar | Florida Atlantic University, USA | |
Emotions detection in social media posts | Pedro Lima Rodrigues, Renato de Oliveira Moraes, Hugo Watanuki, David de Hilster | University of São Paulo, Brazil | |
Estimating the Number of Clusters for the K-Means Algorithm in a Big Data Context | Bruno Costa, Renato de Oliveira Moraes, Hugo Watanuki | University of São Paulo, Brazil | |
Causal Inference and Conditional Independence Testing with RCoT | Mayank Agarwal, Abhay H. Kashyap, G. Shobha, Jyothi Shetty, and Roger Dev | Rashtreeya Vidyalaya College of Engineering, India | |
Analysis of the Surface Water Quality in the State of Karnataka using Distributed Platform | Shravya Dasu, Shobha G, Jyothi Shetty | Rashtreeya Vidyalaya College of Engineering, India | |
HPCC Systems log monitoring in the cloud (in Brazilian Portuguese) | Nathália Ribas, Patricia Plentz, Alysson Oliveira, Hugo Watanuki | Universidade Federal de Santa Catarina, Campus Florianópolis, Brazil | |
Illicit Activity Detection in Bitcoin Transactions using Timeseries Analysis | Rohan Maheshwari, Sriram Praveen V A, Shobha G, Jyoti Shetty, Arjuna Chala, Hugo Watanuki | Rashtreeya Vidyalaya College of Engineering, India | |
2022 | |||
Rashtreeya Vidyalaya College of Engineering, India | |||
Optimal lockdown policy for vaccination during COVID-19 pandemic | Yuting Fu, Hanqing Jin, Haitao Xiang, Ning Wang | Oxford University, UK | |
2021 | |||
Big Data and Logistic Regression Applied to Analysis of Loan Requests (in Brazilian Portuguese) | André Fontanez Bravo, Renato de Oliveira Moraes, Hugo Martinelli Watanuki | University of São Paulo, Brazil | |
VR Supermarket: a Virtual Reality Online Shopping Platform with a Dynamic Recommendation System | Rashtreeya Vidyalaya College of Engineering, India | ||
Design and Implementation of HSQL: A SQL-like language for Data Analysis in Distributed Systems | Anurag Singh Bhadauria, Atreya Bain, Jyoti Shetty, Shobha G, Arjuna Chala, Jeremy Clements | Rashtreeya Vidyalaya College of Engineering, India | |
Parallelizing filter-and-verification based exact set similarity joins on multicores | Fabian Fier, Johann-Christoph Freytag | Humboldt University of Berlin | |
Scaling Up Set Similarity Joins Using a Cost-Based Distributed-Parallel Framework | Fabian Fier, Johann-Christoph Freytag | Humboldt University of Berlin | |
Implementation of generative adversarial networks in HPCC systems using GNN bundle | Ambu Karthik, Jyoti Shetty, Shobha G., Roger Dev | Rashtreeya Vidyalaya College of Engineering, India | |
Hybrid Density-based Adaptive Clustering using Gaussian Kernel and Grid Search | Varsha R Jenni, Akhil Dua, G Shobha, Jyoti Shetty, Roger Dev | Rashtreeya Vidyalaya College of Engineering, India | |
Massively scalable density based clustering (DBSCAN) on the HPCC Systems big data platform | Yatish HR, Shubham Milind Phal, Tanmay Sanjay Hukkeri, Lili Xu, Shobha G, Jyoti Shetty, Arjuna Chala | Rashtreeya Vidyalaya College of Engineering, India | |
Modeling and tracking Covid-19 cases using Big Data analytics on HPCC Systems platform | Flavio Villanustre, Arjuna Chala, Roger Dev, Lili Xu, Jesse Shaw, Borko Furht, Taghi Khoshgoftaar | Florida Atlantic University | |
Orquestração de Aplicações de Computação de Alta Performance em Ambientes Cloud Conteinerizados (in Brazilian Portuguese) | Lucas Varella, Patricia Plentz, Hugo Watanuki, Artur Baruchi | Universidade Federal de Santa Catarina, Campus Florianópolis, Brazil | |
Análise massiva de dados na gestão pública: Uma proposta para identificação de outliers no cadastro de imóveis da prefeitura de São Paulo (in Brazilian Portuguese) | Luiz Fernando Cavalcante Silva, Renato de Oliveira Moraes, Hugo Martinelli Watanuki, Leandro Ramos da Silva | University of São Paulo, Brazil | |
2020 | |||
An evaluation of mathematical models for the outbreak of COVID-19 | Oxford University | ||
Survey on RNN and CRF models for De-identification of Medical Free Text | Joffrey Leevy, Taghi Khoshgoftaar, Flavio Villanustre | Florida Atlantic University | |
Massively Scalable Image Processing on the HPCC Systems Big Data Platform | Shobha G, Shubham Milind Phal, | Rashtreeya Vidyalaya College of Engineering, India | |
Parallelizing Filter-Verification Based Exact Set Similarity Joins on Multicores | Fabian Fier, Johann-Christoph Freytag | Humboldt University of Berlin | |
2019 | |||
Machine Learning Techniques to Detect Fraud in Credit Cards on the HPCC Systems Platform | Rashtreeya Vidyalaya College of Engineering, India | ||
Rashtreeya Vidyalaya College of Engineering, India | |||
Rashtreeya Vidyalaya College of Engineering, India | |||
Design and implementation of Machine Learning Evaluation Metrics on HPCC Systems | Rashtreeya Vidyalaya College of Engineering, India | ||
A Parallel and Distributed Stochastic Gradient Descent Implementation Using Commodity Clusters | Robert K.L. Kennedy, Taghi M. Khoshgoftaar, Flavio Villanustre, Tim Humphrey | Florida Atlantic University | |
Random Forest Implementation and Optimization for Big Data Analytics on LexisNexis’s High Performance Computing Cluster Platform | Victor Herrera Cordova, Taghi Khoshgoftaar, Borko Furht, Flavio Villanustre | Florida Atlantic University | |
A Survey of Machine Learning Algorithms Available on the HPCC Systems | Coming soon | Florida Atlantic University | |
Unsupervised annotation of phenotypic abnormalities via semantic latent representations on electronic health records | Jingqing Zhang, Xiaoyu Zhang, Kai Sun, Xian Yang, Chengliang Dai, Yike Guo | Imperial College, London | |
Integrating Semantic Knowledge to Tackle Zero-shot Text Classification | Jingqing Zhang, Piyawat Lertvittayakumjorn, Yike Guo | Imperial College, London | |
Rashtreeya Vidyalaya College of Engineering, India | |||
Harsh Mishra, Jayanth S, Jyoti Shetty, Shobha G, Arjuna Chala, Dan Camper | Rashtreeya Vidyalaya College of Engineering, India | ||
Massively Scalable Parallel KMeans on the HPCC Systems Platform | Lili Xu, Amy Apon, Roger Dev, Flavio Villanustre, Arjuna Chala | Clemson University | |
2018 | |||
Dest-ResNet: a Deep Spatiotemporal Residual Network for Hotspot Traffic Speed Prediction | Binbing Liao, Jingqing Zhang | Imperial College, London | |
Deep Sequence Learning with Auxiliary Information for Traffic Prediction | Binbing Liao, Jingqing Zhang, Chao Wu, Douglas McIlwraith, Tong Chen, Shengwen Yang, Yike Guo, Fei Wu | Imperial College, London | |
HPCC Benchmarking | Rushikesh Ghatpande | North Carolina State University | |
Finding better active learners for faster literature reviews | Zhe Yu, Nicholas A. Kraft, Tim Menzies | North Carolina State University | |
Security Alerting and Event Management in the Era of Machine Learning: Our Experience in the Industry | Flavio Villanustre | LexisNexis Risk Solutions | |
Cervical Cancer Risk Factors: Exploratory Analysis using HPCC Systems | Omosalewa Itauma, Itauma Itauma | Southern New Hampshire University/Wayne State University | |
2017 | |||
Representativeness of latent dirichlet allocation topics estimated from data samples with application to common crawl | Clemson University | ||
ECL-watch: A big data application performance tuning tool in the HPCC systems platform | Clemson University | ||
Large-scale distributed L-BFGS | Maryam M. Najafabadi, Taghi M. Khoshgoftaar, Flavio Villanustre, John Holt | Florida Atlantic University | |
Learning Text to Image Synthesis with Textual Data Augmentation | Hao Dong, Jingqing Zhang, Douglas McIlwraith, Yike Guo | Imperial College, London | |
TensorDB: Database Infrastructure for Continuous Machine Learning | F. Liu, A. Oehmichen, J. Zhang, K. Sun, H. Dong, Y. Mo, Y. Guo | Imperial College, London | |
The Deep Poincare Map: A Novel Approach for Left Ventricle Segmentation | Yuanhan Mo, Fangde Liu, Douglas McIlwraith, Guang Yang, Jingqing Zhang, Taigang He, and Yike Guo | Imperial College, London | |
Unsupervised deep kernel for high dimensional data | Ying Xie, Linh Le, Jie Hao | Kennesaw State University | |
A sentiment-change-driven event discovery system | Lili Zhang, Ying Xie, Guoliang Liu | Kennesaw State University | |
Trilogy: Data placement to improve performance and robustness of cloud computing | North Carolina State University | ||
2016 | |||
Scalable Dynamic Topic Modeling with Clustered Latent Dirichlet Allocation (CLDA) | Christopher Gropp | Clemson University | |
Automated cluster provisioning and workflow management for parallel scientific applications in the cloud | Brandon Posey, Christopher Gropp, Alexander Herzog, Amy Apon | Clemson University | |
Big Data Technologies and Applications | Borko Furht, Flavio Villanustre | Florida Atlantic University | |
Introduction to Big Data | Borko Furht, Flavio Villanustre | Florida Atlantic University | |
Social Network Analytics: Hidden and Complex Fraud Schemes | Borko Furht, Flavio Villanustre | Florida Atlantic University | |
Modeling Ebola Spread and Using HPCC/KEL System | Ankur Agarwal, Abhishek Jain | Florida Atlantic University | |
The HPCC/ECL Platform for Big Data | David Alan Bayliss, Gavin Halliday, Arjuna Chala, Borko Furht | Florida Atlantic University | |
Deep Learning Techniques in Big Data Analytics | Flavio Villanustre, Taghi M. Khoshgoftaar, Naeem Seliya, Randall Wald, Edin Muharemagic | Florida Atlantic University | |
Visualization of big high dimensional data in a three dimensional space | Ying Xie, Jing He, Pooja Chenna, Linh Le | Kennesaw State University | |
Graph Processing with Massive Datasets: A KEL Primer | Flavio Villanustre | LexisNexis Risk Solutions | |
HPCC Systems for Cyber Security Analytics | Mauricio Renzi | LexisNexis Risk Solutions | |
Unsupervised Learning and Image Classification in High Performance Computing Cluster | Wayne State University | ||
A convex framework for high-dimensional sparse Cholesky based covariance estimation | Kshitij Khare, Syed Rahman, Sang Oh, Bala Rajaratnam | University of Florida | |
2015 | |||
Dynamic Provisioning of Data Intensive Computing Middleware Frameworks: A Case Study | Linh Bao Ngo, Flavio Villanustre, Michael E. Payne, Richard Taylor | Clemson University | |
Assessing the effect of high performance computing capabilities on academic research output | Linh B. Ngo, Michael E. Payne, Paul W. Wilson | Clemson University | |
Deep learning applications and challenges in big data analytics | Maryam M Najafabadi, Flavio Villanustre, Taghi M Khoshgoftaar, Naeem Seliya, Randall Wald, Edin Muharemagic | Florida Atlantic University | |
Industrial big data analytics: lessons from the trenches | Flavio Villanustre | LexisNexis Risk Solutions Group | |
Commercial Big Data Workloads: Lessons from the Industry | Flavio Villanustre | LexisNexis Risk Solutions Group | |
2014 | |||
Return on Investment from Academic Supercomputing | Greg Newby, Amy Apon, Nick Berente, Rudolph Eigenmann, Susan Fratkin, David Lifka, Craig A. Stewart | Clemson University | |
Managing the academic data lifecycle: A case study of HPCC | Clemson University | ||
Using feature selection and classification to build effective and efficient firewalls | Florida Atlantic University | ||
Large-scale entity extraction and probabilistic record linkage | Flavio Villanustre | LexisNexis Risk Solutions Group | |
Big data trends and evolution: a human perspective | Flavio Villanustre | LexisNexis Risk Solutions Group | |
2013 | |||
Academic publishing as a social media paradigm | Clemson University | ||
Efficiency as a Measure of Knowledge Production of Research Universities | Amy W. Apon, Michael E. Payne, Linh Bao Ngo, Paul W. Wilson | Clemson University | |
2011 | |||
Handbook of Data Intensive Computing | Borko Furht, Armando Escalante | LexisNexis Risk Solutions Group | |
Parallel Processing, Multiprocessors and Virtualization in Data-Intensive Computing | Jonathan Burger, Richard Chapman, Flavio Villanustre | LexisNexis Risk Solutions Group |