...

  • Contributions to HPCC Systems - From Virtual Collaboration to Virtual Reality
    Dr G Shobha, RV College of Engineering

    This talk focuses on the virtual collaboration between RV College of Engineering and LexisNexis Risk Solutions on recent contributions to the HPCC Systems Platform. These include plugins and extensions to the Machine Learning bundles for HPCC Systems, followed by an analysis of the impact of skewed data distributions on the most commonly used ECL operations. The talk concludes with case studies executed on HPCC Systems, including the implementation of a virtual reality application.
  • HSQL: An SQL-like Language for HPCC Systems
    Atreya Bain, RV College of Engineering & HPCC Systems Intern 2021 and Mahdi Kashani, LexisNexis Risk Solutions Group

    There is a steep learning curve to getting used to handling Big Data, especially in distributed systems, where the task of data processing is split amongst various nodes in clusters.
    HSQL is the new big-data query language of HPCC Systems, an innovative, open-source solution that lets users process their data at any scale. It is designed to work in conjunction with ECL, the primary programming language for HPCC Systems, and is intended to be easy to work with and robust for general-purpose analysis. HSQL provides a compact, easy-to-comprehend SQL-like syntax for visualizations, general data analysis, and training Machine Learning models; it allows such programs to be structured modularly and integrates easily with the VS Code IDE. In this presentation, learn why HSQL is important and how it adds value for HPCC Systems users, explore its syntax, see a couple of examples on different datasets, and walk through its installation and setup instructions.
  • New Advancements to Logistic Regression and the ML Library
    Lili Xu, LexisNexis Risk Solutions Group

    Logistic Regression is one of the most important analytic tools in the social and natural sciences, with applications such as natural language processing and image recognition. One of our Machine Learning advancements is a renovation of the current HPCC Systems Logistic Regression bundle that adds the ability to handle both binary and multi-class prediction tasks. Another advancement improves the performance of the Preprocessing bundle and removes its bottlenecks. The improved version is more scalable and more efficient for Big Data preprocessing tasks.

  • The Causality Analytics Toolkit for HPCC Systems
    Roger Dev, LexisNexis Risk Solutions Group

    Causal Reasoning is at the heart of most human thought and action, yet has only recently been formalized as a mathematical and scientific field of study. It is hard to conceive of achieving a true AI without such a capability. Although the science of Causality has not advanced to the threshold of AI, it can unlock capabilities that are beyond the realm of statistical observation. Current Machine Learning methods assess observational patterns and learn to replicate the results of patterns previously detected. They make no effort to disentangle true causal effects from observed correlation. They lack the ability to respond to changes in the scenarios that generated the data, or to predict the effect of new actions on the outcome. Causal Science provides a path toward a deeper understanding of our data. It defines mechanisms that can separate causal influences from spurious correlation and infer causal effects from observational data. As these techniques evolve, they stand to revolutionize our understanding and uses of data. Causality 2021 is an HPCC Systems research and development program. The goal is to increase our understanding of the latest causal algorithms, assess and challenge the current state of the art, and develop a Causality Toolkit for the HPCC Systems Platform. This project encompasses all three levels of the “Ladder of Causality” (“Seeing”, “Doing”, and “Imagining”), as well as Causal Model Validation and Causal Discovery. This project includes work from three interns who joined the HPCC Systems Intern Program in 2021.

  • The Forecast of COVID-19 Spread Risk at The County Level
    Murtadha Hssayeni, Florida Atlantic University

    The early detection of coronavirus disease 2019 (COVID-19) outbreaks is important to save people's lives and restart the economy quickly and safely. People's social behavior, reflected in their mobility data, plays a major role in spreading the disease. Therefore, we used daily mobility data aggregated at the county level, alongside COVID-19 statistics and demographic information, for short-term forecasting of COVID-19 outbreaks in the United States. The daily data are fed to a deep learning model based on Long Short-Term Memory (LSTM) to predict the accumulated number of COVID-19 cases in the next two weeks. A significant average correlation (r = 0.83, p = 0.005) was achieved between the model-predicted and actual accumulated cases in the interval from August 1, 2020 until January 22, 2021. The model predictions had r > 0.7 for 87% of the counties across the United States. A lower correlation was reported for counties with fewer than 1,000 total cases during the test interval. The average mean absolute error (MAE) was 605.4 and decreased with a decrease in the total number of cases during the testing interval. The model was able to capture the effect of government responses on COVID-19 cases. It was also able to capture the effect of age demographics on the COVID-19 spread: the average daily cases decreased with a decrease in the retiree percentage and increased with an increase in the young percentage. Lessons learned from this study can help not only with managing the COVID-19 pandemic but also with early and effective management of possible future pandemics. The project used the HPCC Systems platform for collecting, hosting, and analyzing the data.

...

  • Deploying Digital Human Readers Leveraging HPCC Systems
    David de Hilster, LexisNexis Risk Solutions Group

    With the newly launched NLP-Plugin for HPCC Systems and the VSCode NLP Language Extension, the community now has the ability to incorporate human-like “digital readers” into HPCC Systems to mine information from free text that has, up until now, been impossible to extract. Future projects will be discussed, including reading radiology reports, business reports, and real estate documents, the latter of which could open new markets across the industry. It is important for everyone to understand this new technology in order to spot potential applications for extracting unmined data that was previously impossible to obtain. Sharing our own use case, we explain our end goal: to create an NLP Center of Excellence that will serve the entire company with digital readers, first in English and then in other languages, to open new streams of revenue.
  • HPCC Systems Thor Monitor: Using Workunit Services and Power BI to Monitor Thor Activity
    Jessica Skaggs, LexisNexis Risk Solutions Group

    The ECL Workunit Services standard library functions can be used to capture details about workunits running on Thor, including processing time, errors, current state, and more. Capturing these details supports monitoring, trend analysis, error analysis, degradation detection, and other measurements that can help improve the efficiency of your Thor environments. We will look at how to use this information to monitor the system with visualizations in Power BI.
  • Cooperative actions between University of São Paulo and LexisNexis Risk Solutions
    Renato de Oliveira Moraes, University of São Paulo

    Prof. Renato discusses the successful joint initiatives underway between the University of São Paulo (USP) and LexisNexis Risk Solutions in Brazil to leverage HPCC Systems for teaching and learning, research, and extension activities in academia, including recent machine learning projects.

  • Processing Student Image Data with Kubernetes and HPCC Systems GNN on the Cloud
    Carina Wang, American Heritage School and HPCC Systems Intern 2021

    In order to foster a safe learning environment, measures to bolster campus security have emerged as a top priority around the world. In this session, I will share how HPCC Systems was leveraged to process student images with Kubernetes running on the Cloud Native Platform while utilizing the Generalized Neural Network (GNN) bundle for image classification. The result is a trained model which can be implemented on the autonomous security robot we built to help campus security personnel identify visitors, students, and staff.

  • Athlete 360: Leveraging HPCC Systems and RealBI for Athlete Wellness and Performance
    Christopher Connelly, North Carolina State University and HPCC Systems Intern 2021

    A lot goes into an athlete being able to perform at their best when it matters most. There are not only physical demands but also factors from outside their sport that affect their wellbeing and readiness to perform. In team sports, there are many external variables that cannot be controlled, which makes it difficult to gauge the performance of individual athletes. The better we understand what an athlete does and how their body responds, the better we can support them to be at their best. Within collegiate athletics, and sports in general, it is a struggle to interpret data from different streams together in a single report. Streamlined data collection can further aid that understanding. This involves data from all aspects of an athlete’s day, including wellness questionnaires, practice training loads, weight room training loads, and weight room assessments of strength, power, and fatigue. In the past we have shown the impact of using HPCC Systems with the NC State Men’s soccer team. Here you will see some solutions using HPCC Systems and RealBI to provide insight from data collected with the NC State Women's basketball team, as well as how this system can serve not only the Strength and Conditioning department but the athletics department as a whole.

...