Sarah Nash - 2023 Poster Contest Resources

Sarah is a Master's student entering her second year of studying Applied Data Science at New College of Florida.

Poster Abstract

Causality is a growing field centered on detecting cause-and-effect relationships within observational data. It is a common saying that “correlation does not imply causation.” However, correlation does imply causation, though perhaps in a different way than we expected and between different variables than we were originally observing. Causal discovery and validation methods focus on uncovering the causal relationships within large datasets, determining where cause-and-effect relationships appear between the variables. When a causal relationship generates our data, it leaves behind subtle hints marking its existence. By tuning in to these signals, we are able to discover causal relations from the data with the help of a number of different causal discovery methods. The HPCC Systems Causality Framework “Because” is a toolkit for multiple areas of causal analysis, including discovery and validation. The discovery algorithms previously implemented in the toolkit are mainly compatible with two data types: continuous numeric and discrete numeric. This project’s focus is to expand the discovery portion of the toolkit to additionally handle the remaining data type: categorical data.

This task had multiple parts:

  • Creating a framework for generating categorical data

  • Implementation of a particular model, the Uniform Channel Model, within the Causality toolkit

  • Testing with data generated from the Synthetic Data Generation subpackage of the Causality toolkit

  • Testing with real CDC data
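
As a rough illustration of the first task, the sketch below generates categorical data in which one variable causally influences another, then recovers the conditional distribution by counting. It does not use the Causality toolkit or its Synthetic Data Generation subpackage, and it is not the Uniform Channel Model. The function names, category labels, and probabilities are all hypothetical, chosen only to show what "categorical data with a known causal direction" can look like.

```python
import random
from collections import Counter

def generate_categorical_pair(n, seed=0):
    """Draw n samples of (x, y) where the category of X influences Y."""
    rng = random.Random(seed)
    # Hypothetical marginal distribution of the cause X
    x_levels = ["low", "medium", "high"]
    x_weights = [0.5, 0.3, 0.2]
    # Hypothetical conditional probability table P(Y | X)
    y_levels = ["no", "yes"]
    y_given_x = {
        "low": [0.9, 0.1],
        "medium": [0.6, 0.4],
        "high": [0.2, 0.8],
    }
    samples = []
    for _ in range(n):
        x = rng.choices(x_levels, weights=x_weights)[0]
        # Y is sampled from the row of the table selected by X
        y = rng.choices(y_levels, weights=y_given_x[x])[0]
        samples.append((x, y))
    return samples

def empirical_conditional(samples):
    """Estimate P(Y | X) from the samples by counting."""
    joint = Counter(samples)
    x_totals = Counter(x for x, _ in samples)
    return {(x, y): count / x_totals[x] for (x, y), count in joint.items()}

samples = generate_categorical_pair(10_000)
table = empirical_conditional(samples)
for (x, y), p in sorted(table.items()):
    print(f"P(Y={y} | X={x}) is about {p:.2f}")
```

Because the generating table is known, data like this gives a ground truth against which a categorical discovery method can be checked before it is applied to real data.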

In all, we were able to determine strengths and weaknesses of this particular model through various tests, as well as areas for improvement within the Causality toolkit. By the end of this project, we have created a foundation for generating and testing categorical or binary data, expanded the toolkit’s causal analysis capabilities, and made room for additional categorical discovery methods to be added in the future.

Presentation

In this video recording, Sarah provides a tour and explanation of her poster content.

Causal Discovery and Validation with Categorical Data

